GenCore version 4.5
Copyright (c) 1993 - 2000 Compugen Ltd.

OM nucleic - nucleic search, using sw model

Run on:         February 7, 2002, 11:06:26 ; Search time 3842.15 Seconds
                                             (without alignments)
                                             1854.894 Million cell updates/sec

Title:          US-09-394-745-6514
Perfect score:  432
Sequence:       1 gtccagcagctcggacttac attttctttttttttcttgg 432

Scoring table:  IDENTITY_NUC
                Gapop 10.0 , Gapext 1.0

Searched:       1472140 seqs, 8248589755 residues

Total number of hits satisfying chosen parameters:     2944280

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
Listing first 45 summaries 
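
The throughput figure in the header can be cross-checked from the other header values. The sketch below assumes that the query length equals the perfect score (432, which holds if an identity match scores 1) and that both strands were searched, which is suggested by the hit count being exactly twice the number of database sequences. Neither assumption is stated by the report itself.

```python
# Cross-check of the "Million cell updates/sec" header figure.
QUERY_LENGTH = 432            # assumed equal to the perfect score
DB_SEQUENCES = 1_472_140      # "Searched: ... seqs"
DB_RESIDUES = 8_248_589_755   # "Searched: ... residues"
SEARCH_SECONDS = 3842.15      # "Search time"
STRANDS = 2                   # assumption: forward and complement both searched


def million_cell_updates_per_sec(query_len, db_residues, seconds, strands):
    """Dynamic-programming cells filled per second, in millions."""
    cells = query_len * db_residues * strands
    return cells / seconds / 1e6


mcups = million_cell_updates_per_sec(
    QUERY_LENGTH, DB_RESIDUES, SEARCH_SECONDS, STRANDS
)
print(f"{mcups:.3f} Million cell updates/sec")
print("hits per database sequence:", 2_944_280 / DB_SEQUENCES)
```

Under these assumptions the computed rate agrees with the reported 1854.894 to three decimal places.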



Database :       GenEmbl:*
                 1:  gb_ba:*
                 2:  gb_htg:*
                 3:  gb_in:*
                 4:  gb_om:*
                 5:  gb_ov:*
                 6:  gb_pat:*
                 7:  gb_ph:*
                 8:  gb_pl:*
                 9:  gb_pr:*
                 10:  gb_ro:*
                 11:  gb_sts:*
                 12:  gb_sy:*
                 13:  gb_un:*
                 14:  gb_vi:*
                 15:  em_ba:*
                 16:  em_fun:*
                 17:  em_hum:*
                 18:  em_in:*
                 19:  em_om:*
                 20:  em_or:*
                 21:  em_ov:*
                 22:  em_pat:*
                 23:  em_ph:*
                 24:  em_pl:*
                 25:  em_ro:*
                 26:  em_sts:*
                 27:  em_sy:*
                 28:  em_un:*
                 29:  em_vi:*
                 30:  em_htgo_hum:*
                 31:  em_htgo_inv:*
                 32:  em_htgo_rod:*
                 33:  em_htg_hum:*
                 34:  em_htg_inv:*
                 35:  em_htg_rod:*
                 36:  em_htg_other:*




Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 

SUMMARIES 

                                  %
 Result            Query
    No.    Score   Match   Length  DB  ID          Description
      1    139.4    32.3   128017   8  AC084282    AC084282 Oryza sat
      2    138.4    32.0     1775   8  AF321856    AF321856 Lolium ri
      3    136.2    31.5     1810   8  AF321857    AF321857 Lolium ri
      4      133    30.8     1795   8  AF321855    AF321855 Lolium ri
      5     81.2    18.8      433   8  AF140486    AF140486 Oryza sat
      6       69    16.0     1373   8  AF082028    AF082028 Hemerocal
 c    7     68.8    15.9   165909   2  AP003711    AP003711 Oryza sat
      8     60.6    14.0     1652   8  D78607      D78607 Arabidopsis
      9             13.8            6              Sequence
 c   10     59.2    13.7               ATF6G17     Arabidops
 c   11                                            Arabidops
     17     56.6    13.1     1674   8  HTCYP81C    AJ000477 Helianthu
     18     56.6    13.1     1719   8  HTCYP81L    AJ000478 Helianthu
 c   19             12.6   138858   8  AP002968    AP002968 Oryza sat
 c   20             12.6   156393   8              Oryza sat
     21                             8              Glycyrrhi
     22             12.3     4352   8  ZMCP71C1G   X81828 Z.mays CYP7
     23     52.6    12.2    72415   2  H0102C09    AL442103 Oryza sat
 c   24     52.4    12.1   163055   2  AP003626    AP003626 Oryza sat
     25     51.6    11.9     1185   8  AF004210    Zea mays
     26                                            mays CYP7
     27                             8  AC074025    Arabidops
     28             11.8            2  AP003990    Oryza sat
 c   29     50.8    11.8            2  AP004000    AP004000 Oryza sat
                                    2  AP003571    AP003571 Oryza sat
     33                             8              Arabidops
 c                                     AP004022    Oryza sat
                                                   AC068924 Oryza sat

ALIGNMENTS



RESULT 1
AC084282
LOCUS       AC084282   128017 bp    DNA             PLN       19-JUN-2001
DEFINITION  Oryza sativa chromosome 3 BAC OSJNBb0048A17 genomic sequence,
            complete sequence.
ACCESSION   AC084282
VERSION     AC084282.6  GI:14389338
KEYWORDS    HTG.
SOURCE      Oryza sativa.
  ORGANISM  Oryza sativa
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae;
            Ehrhartoideae; Oryzeae; Oryza.
REFERENCE   1  (bases 1 to 128017)
  AUTHORS   Buell,C.R., Yuan,Q., Ouyang,S., Moffat,K.S., Hill,J.N.,
            Gansberger,K., Brenner,M., Burgess,S., Hance,M., Shvartsbeyn,M.,
            Tsitrin,T., Riggs,F., Hsiao,J., Zismann,V., Blunt,S., Pai,G.,
            VanAken,S.E., Utterback,T.R., Feldblyum,T.V., Quackenbush,J.,
            Salzberg,S.L., White,O. and Fraser,C.M.
  TITLE     Oryza sativa chromosome 3 BAC OSJNBb0048A17 genomic sequence
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 128017)
  AUTHORS   Buell,R.
  TITLE     Direct Submission
  JOURNAL   Submitted (20-OCT-2000) The Institute for Genomic Research, 9712
            Medical Center Dr, Rockville, MD 20850, USA
REFERENCE   3  (bases 1 to 128017)
  AUTHORS   Buell,R.
  TITLE     Direct Submission
  JOURNAL   Submitted (13-JUN-2001) The Institute for Genomic Research, 9712
            Medical Center Dr, Rockville, MD 20850, USA
REFERENCE   4  (bases 1 to 128017)
  AUTHORS   Buell,R.
  TITLE     Direct Submission
  JOURNAL   Submitted (19-JUN-2001) The Institute for Genomic Research, 9712
            Medical Center Dr, Rockville, MD 20850, USA, rbuell@tigr.org
COMMENT     On Jun 13, 2001 this sequence version replaced gi:12039441.
            Address all correspondence to: rice@tigr.org
            BAC clone OSJNBb0048A17 is from Oryza sativa chromosome 3
            The orientation of the sequence is from SP6 to T7 end of the BAC
            clone.
            Genes were identified by a combination of several methods: Gene
            prediction programs including Fgenesh (http://www.softberry.com/),
            genscan and Genscan+ (Chris Burge,



Matches 191; Conservative 0; Mismatches 139; Indels 0; Gaps 

Qy 61 agcaagagctctggatggtcattagcatgtcctctgttgcggtcgtgaagttcttcctca 120 

I I I I I I I I I I I I I I I I I III I I I I I I I I I I I I I I 
Db 5 AACAACTAATATGGTTGTACTCTATCATGATATTTGCAACTGTGGTGAAGCTTATACTCT 64

Qy 121 tgctctactgccgaacgttcaagaatgagatcgtgagggcctacgcccaggaccatttct 180 

I I I I II I I I I I I I I II I I I I I I I I I I I I I I I I I I I II 
Db 65 GGCTTTACTGCAGAAGCTCGAGAAACAAGATTGTCCGTGCCTATGCAGATGATCACCACT 124 

Qy 181 tcgacgtaatcacaaactctgtcggcctggtctcggcgctgctcgctgtccggtacaaat 240 

I I I I I I I I I I I I I I I III I I I I I I I I I I I II 

Db 125 TTGATGTGGTAACAAATGTAGTTGGATTAGTTGCGGCTATTCTTGGTGATAAATTTTACT 184 

Qy 241 ggtggatggaccctgttggcgccatactgatcgcgttgtacacgatcacgacgtgggcgc 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 185 GGTGGATTGATCCGATAGGAGCTATTTTGCTTGCAATTTACACCATCTCAAATTGGTCTC 244

Qy 301 gaacggtgctggagaacgtaggcacactgataggcaagtcggcgccggcagagtacctga 360

I I I I I I I I I I I I I I I I I I I I I I I I I I I I III II
Db 245 GCACTGTCATGGAGAATGCCGTTTCATTGGTGGGACAATCTGCACCTCCTGAAGTTTTGC 304

Qy 361 cgaagctcacgtacttgatctggaaccacc 390

I I I I I I I I I I I I I I I I I 
Db 305 AGAAGCTAACATATCTCGTTATAAGGCACC 334 

Search completed: February 7, 2002, 08:21:05
Job time: 18142 sec 



            http://CCR-081.mit.edu/GENSCAN.html), GeneMarkHMM (Mark Borodovsky,
            http://genemark.biology.gatech.edu/GeneMark/), and GeneSplicer
            (Mihaela Pertea and Steven Salzberg, contact mpertea@tigr.org),
            searches of the complete sequence against a peptide database and
            the plant EST database at TIGR (http://www.tigr.org/tdb/tgi.shtml).
            Annotated genes are named to indicate the level of evidence for
            their annotation. Genes with similarity to other proteins are named
            after the database hits. Genes without significant peptide
            similarity but with EST similarity are named as unknown proteins.
            Genes without protein or EST similarity, that are predicted by more
            than two gene prediction programs over most of their length are
            annotated as hypothetical proteins. Genes encoding tRNAs are
            predicted by tRNAscan-SE (Sean Eddy,
            http://genome.wustl.edu/eddy/tRNAscan-SE/). Simple repeats are
            identified by repeatmasker (Arian Smit,
            http://ftp.genome.washington.edu/RM/RepeatMasker.html).
FEATURES             Location/Qualifiers
     source          1..128017
                     /organism="Oryza sativa"
                     /cultivar="Nipponbare"
                     /sub_species="japonica"
                     /db_xref="taxon:4530"
                     /chromosome="3"
                     /map="R2 404"
                     /clone="OSJNBb0048A17"
     repeat_region   complement(4411..4455)
                     /rpt_family="AT_rich"
     mRNA            join(5116..5166,5274..5423,6406..6528,6614..6670,
                     6955..7029,7121..7523)
                     /gene="OSJNBb0048A17.2"
     gene            5116..7523
                     /gene="OSJNBb0048A17.2"
                     /note="nearly identical to translation initiation factor
                     5A GB:CAB96075 GI:8919176 (Oryza sativa); EST AU057661,
                     AU108424 from this gene"
     CDS             join(5298..5423,6406..6528,6614..6670,6955..7029,
                     7121..7225)
                     /gene="OSJNBb0048A17.2"
                     /codon_start=1
                     /product="translation initiation factor 5A"
                     /protein_id="AAK63944.1"
                     /db_xref="GI:14488377"
                     /translation="MSDSEEHHFESKADAGASKTYPQQAGTIRKNGHIVIKNRPCKVV
                     EVSTSKTGKHGHAKCHFVAIDIFNGKKLEDIVPSSHNCDVPHVNRTDYQLIDISEDGF
                     VSLLTESGGTKDDLRLPSDEALLTQIKDGFAEGKDLIVTVMSAMGEEQICALKDIGPK
                     N"
     repeat_region   complement(7835..7913)
                     /rpt_family="AT_rich"
     repeat_region   complement(7915..8003)
                     /rpt_family="Gaijin_012 MITE element from gb:U72728 Oryza
                     longistaminata receptor-like kinase protein (Xa21), family
                     member F, pseudogene sequence (233 to 384) 152 nt"
     repeat_region   complement(7921..8041)
                     /rpt_family="Gaijin_Os3 element from gb:D32165 Rice gene
                     for aspartic protease (302 to 448) 147 nt"
     mRNA            join(<8181..8441,8581..8675,8890..9007,9093..9211,
                     9300..9486,9573..9719,10135..10681)
                     /gene="OSJNBb0048A17.3"
     gene            8181..10681
                     /gene="OSJNBb0048A17.3"
                     /note="EST AU029806 from this gene"
     CDS             join(8181..8441,8581..8675,8890..9007,9093..9211,
                     9300..9486,9573..9719,10135..10425)
                     /gene="OSJNBb0048A17.3"
                     /codon_start=1
                     /product="unknown protein"
                     /protein_id="AAK63921.1"
                     /db_xref="GI:14488354"
                     /translation="MAEALVAVLRLAASAAATARPQSRSGRHGSCAARVPCPGPSPFR
                     RGRLCARAAVAGPPEVDDDDAMTIDNLRRFFDVNVGKWNGAFYQFDAHGRVLQGISTR
                     LSVSTYGEDDLISLLQSLYIKQASSQISFVDEEDSEEWVEYKIKETNMFTVDKYQQVG
                     FFQEEKAFALRYQTAGMLETVLRAGVLGEDDTGEESPKNLKIPSRKPSIVCENCLYSR
                     EGNGRVRAFHIMDPKGVLDMLIIFHEKQGSEVPLMYSSDDADITNSDRIAPLLGRWEG
                     RSVTKRSGVYGATLSEADTVVLLEKDRNGQLILDNMSTKSGSSTTTTVHWTGSANNNL
                     LQFDGGYEMTLLPGGMYMGYPTDIGKIVNDMDSFHLEFCWMESPGKRQRLVRTYDSAG
                     LAVSSTYFFETKV"
     repeat_region   complement(8326..8355)
                     /rpt_family="GC_rich"
     repeat_region   complement(8464..8503)
                     /rpt_family="(GAA)n"
     repeat_region   12217..12281
                     /rpt_family="(CGG)n"
     mRNA            join(<12233..12508,12919..13577,13715..13858,
                     15212..>15452)
                     /gene="OSJNBb0048A17.12"
     gene            12233..15452
                     /gene="OSJNBb0048A17.12"
                     /note="similar to DNA binding protein GB:CAA88326
                     GI:1159877 (Avena fatua)"
     CDS             join(12233..12508,12919..13577,13715..13858,15212..
                     15452)
                     /gene="OSJNBb0048A17.12"
                     /codon_start=1
                     /product="putative DNA-binding protein"
                     /protein_id="AAK63923.1"
                     /db_xref="GI:14488356"
                     /translation="MADRRRSDGGGGMQQQPFTSPGQERVFDGGGVPGQVAAPYGSDF
                     DQSSYMALLAAGAVGVGVGVQPTAAPWAVEEDVAAAPPGISLAPQFSMANYAPPPSYQ
                     HPATLVSPPLAAGLHPYPPYLHGVDAPPPQWPPRPAPPPSFSVLDL7UVAAAPHEQRHS
                     MQQLLLRAAAFGGGMHAAAAPAPAAAAAIEQPAKDGYNWRKYGQKQLKDAESPRSYYK
                     CTRDGCPVKKIVERSSDGCIKEITYKGRHSHPRPVEPRRGGAASSSSSAMAAGTDHNA
                     GAAADDAAAADEDDPSDDDDTLLHEDDDDGEEGHDRGVDGEVGQRVVRKPKIILQTRS
                     EVDLLDDGYRWRKYGQKVVKGNPRPRSYYKCTADGCNVRKQIERASADPKCVLTTYTG
                     RHNHDPPGRPPAAANLQMPGPAAMRLAGGGTAHQQPSGGAHQMKEET"
     repeat_region   complement(12929..13100)
                     /rpt_family="(CGG)n"
     repeat_region   13144..13205
                     /rpt_family="(CGG)n"
     repeat_region   complement(13404..13425)
                     /rpt_family="GC_rich"
     repeat_region   13463..13565
                     /rpt_family="(CGA)n"
     repeat_region   complement(14525..14573)
                     /rpt_family="AT_rich"
     repeat_region   complement(15595..15615)
                     /rpt_family="(CAA)n"
     repeat_region   complement(16115..16162)
                     /rpt_family="(GA)n"
     mRNA            complement(join(<16194..16307,16392..16514,16642..16836,
                     16923..17048,17142..17304,17388..17523,17675..18034,
                     18144..18242,18340..>18607))
                     /gene="OSJNBb0048A17.13"
     gene            complement(16194..18607)
                     /gene="OSJNBb0048A17.13"
                     /note="predicted by fgenesh"
     CDS             complement(join(16194..16307,16392..16514,16642..16836,
                     16923..17048,17142..17304,17388..17523,17675..18034,
                     18144..18242,18340..18607))
                     /gene="OSJNBb0048A17.13"
                     /codon_start=1
                     /product="hypothetical protein"
                     /protein_id="AAK63926.1"
                     /db_xref="GI:14488359"
                     /translation="MGSCVSTTRRRRRSRKLSVAARKFRRKVSAAIADAPIARSGGGG
                     GAGGEVAAANCFARHEVVHVEAPVSNVTLHLTQLQWQHSQMDAGSVICEEAWYDSVSI
                     LDSADSEDDDLDNDFASVSGDPLPDVTATATSTSTSLLDAVHRLRSIASAEACQDDDP
                     PGKAEESNAAAAADECCSSSGGGLKESAASSTRPPFPPSIPSNKIQPMPIVSVSPHSQ
                     KKKSAVVRLSFRRRSYEGDEMTEMSGSTNYLYRPRAGSSLPCSTGEKLSDGCWSAIEP
                     SVFRVRGESFFKDKRKSPAPNCSPYIPIGADMFACTRKINHIAQHLALPSLKAHETFP
                     SLLIVNIQMPTYPATVFGENDGDGISLVLYFKLSDSFDKEISPQLKESIKKLMGDEME
                     RVKGFPVDSNVPYTERLKILAGLVNPDDLQLSAAERKLVQTYNQKPVLSRPQHKFFKG
                     PNYFEIDLDVHRFSFISRKGLEAFRERLKHGVLDLGLTIQAQKAEELPEHVLCCMRLN
                     KIDFADSGQIPTLIMSSDE"
     repeat_region   complement(18449..18488)
                     /rpt_family="(CGG)n"
     repeat_region   18751..18773
                     /rpt_family="(CAAT)n"
     repeat_region   20529..20560
                     /rpt_family="(GGAGAA)n"
     mRNA            join(<20577..20761,21406..21522,21610..>22228)
                     /gene="OSJNBb0048A17.25"
     gene            20577..22228
                     /gene="OSJNBb0048A17.25"
                     /note="predicted by genemarkHMM"
     CDS             join(20577..20761,21406..21522,21610..22228)
                     /gene="OSJNBb0048A17.25"
                     /codon_start=1
                     /product="hypothetical protein"
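
The relation between a CDS location and its /translation can be checked mechanically. The following is a minimal sketch, not a full GenBank location parser: the helper names `parse_location` and `spliced_length` are hypothetical, and the regular expression only extracts `start..end` pairs (tolerating stray spaces around the dots and the partial markers `<` and `>`). It uses the first CDS of this record, the translation initiation factor 5A gene.

```python
import re


def parse_location(location):
    """Return (start, end) pairs found in a GenBank location string.

    Works for plain ranges and for join(...) / complement(...) wrappers;
    partial-end markers (< and >) are ignored.
    """
    pairs = re.findall(r"<?(\d+)\s*\.\s*\.\s*>?(\d+)", location)
    return [(int(start), int(end)) for start, end in pairs]


def spliced_length(location):
    """Total number of bases covered by all segments of the location."""
    return sum(end - start + 1 for start, end in parse_location(location))


cds = "join(5298..5423,6406..6528,6614..6670,6955..7029,7121..7225)"
translation = (
    "MSDSEEHHFESKADAGASKTYPQQAGTIRKNGHIVIKNRPCKVV"
    "EVSTSKTGKHGHAKCHFVAIDIFNGKKLEDIVPSSHNCDVPHVNRTDYQLIDISEDGF"
    "VSLLTESGGTKDDLRLPSDEALLTQIKDGFAEGKDLIVTVMSAMGEEQICALKDIGPK"
    "N"
)

nt = spliced_length(cds)
codons = nt // 3
print(nt, "nt =", codons, "codons;", len(translation), "residues + stop")
```

The five exons sum to 486 nt, that is 162 codons, which matches the 161 translated residues plus one stop codon.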



Query Match 32.3%; Score 139.4; DB 8; Length 128017;

Best Local Similarity 71.4%; Pred. No. 6.4e-25;

Matches 182; Conservative 0; Mismatches 73; Indels 0; Gaps 0;

Qy 41 ggcggcaaggccaaggggcccctgctgatccctttcgggatggggcggcccaattgcccc 100

I I I I I I I I I I I III I I I I I II I I I I I I I I I I I I I I I I I I I I 
Db 55653 GGCGGCGGGTGCGACGGCAACCTCTCGATGCCGTTCGGGATGGGGAGGCGGAGGTGCCCC 55712 



Qy 101 ggggaaacgctcgcgctgcggaccgtcgggctggtgctcgcaacgctgctcaattgcttc 160 

I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 55713 GGCGAGACGCTGGCTCTGCACACGGTGGGGCTGGTGCTGGGCACGCTGATCCAGTGCTTC 55772 



Qy 161 gactgggacacggttgatggagctcaggtttgacatgaagctancggcgggctgaccatg 220

I I! I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 55773 GACTGGGAGAGGGTCGATGGCGTGGAGGTCGACATGGCTGAGGGTGGCGGGCTCACCATG 55832 

Qy 221 ccccgggccgtcccgttggaggccatgtgcangccgcgtacagctatgcgtggtgttctt 280 

III II III I I I II I I I I I I I I I I I I I I I I I I I II III I I I I I I I I I I I 
Db 55833 CCCAAGGTCGTGCCGTTGGAGGCCGTGTGCAGGCCGCGCGACGCCATGGGTGGTGTTCTT 55892

Qy 281 aagaggctctgaaaa 295 

I I I I I I I I I 
Db 55893 CGCGAGCTCTGAACA 55907 


RESULT 2
AF321856
LOCUS       AF321856     1775 bp    mRNA            PLN       18-APR-2001
DEFINITION  Lolium rigidum clone FHH-t putative cytochrome P450 mRNA, complete
            cds.
ACCESSION   AF321856
VERSION     AF321856.1  GI:13661745
KEYWORDS
SOURCE      Lolium rigidum.
  ORGANISM  Lolium rigidum
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae;
            Pooideae; Poeae; Lolium.
REFERENCE   1  (bases 1 to 1775)
  AUTHORS   Fischer,T.C., Klattig,J.T. and Gierl,A.
  TITLE     A general cloning strategy of divergent plant cytochrome P450 genes
            and its application in Lolium rigidum and Ocimum basilicum
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 1775)
  AUTHORS   Fischer,T.C., Klattig,J.T. and Gierl,A.
  TITLE     Direct Submission
  JOURNAL   Submitted (16-NOV-2000) Lehrstuhl fuer Zierpflanzenbau,
            TU-Muenchen, Am Hochanger 4, Freising 85350, Germany
FEATURES             Location/Qualifiers
     source          1..1775
                     /organism="Lolium rigidum"
                     /isolate="SLR 31"
                     /db_xref="taxon:89674"
                     /clone="FHH-t"
     CDS             32..1585
                     /codon_start=1
                     /product="putative cytochrome P450"
                     /protein_id="AAK38080.1"
                     /db_xref="GI:13661746"
                     /translation="MDKAYIAILSCAFLFLVHYVLGKVSDGRRGKKGAVQLPPSPPAV
                     PFLGHLHLVDKPIHATMCRLAARLGPVFSLRLGSRRAVVVSSSECARECFTEHDVTFA
                     NRPKFPSQLLVSFNGTALVTSSYGPHWRNLRRVATVQLLSAHRVACMSGVIAAEVRAM
                     ARRLFHATEASPDGAARVQLKRRLFELSLSVLMETIAQTKATRSEADADTDMSVEAQE
                     FKEVVDKLIPHLGAANMWDYLPVMRWFDVFGVRNKILHAVSRRDAFLRRLIDAERRRL
                     ADGGSDGDKKSMIAVLLTLQKTEPKVYTDTMITALCANLFGAGTETTSTTTEWAMSLL
                     LNHPAALKKAQAEIDASVGTSRLVSVDDVPSLAYLQCIVSETLRLYPAAPLLLPHESS
                     ADCKVGGYNVPADTMLIVNAYAIHRDPAAWEDPLEFRPERFEDGKAEGLFMIPFGMGR
                     RRCPGETLALRTIGMVLATLVQCFDWEPVDGVKVDMTEGGGFTIPKAVPLEAVCRPRA
                     VMRDVLQNL"
BASE COUNT      316 a    580 c    558 g    321 t

ORIGIN 



Query Match 32.0%; Score 138.4; DB 8; Length 1775;

Best Local Similarity 69.3%; Pred. No. 1.5e-24;

Matches 187; Conservative 0; Mismatches 83; Indels 0; Gaps 0;

Qy 41 ggcggcaaggccaaggggcccctgctgatccctttcgggatggggcggcccaattgcccc 100

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1334 GACGGCAAGGCCGAGGGTCTGTTCATGATACCGTTCGGGATGGGGCGGCGGAGGTGCCCC 1393

Qy 101 ggggaaacgctcgcgctgcggaccgtcgggctggtgctcgcaacgctgctcaattgcttc 160

I I I I I I I I I I 11 I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I
Db 1394 GGGGAGACGCTGGCGCTGCGGACGATCGGAATGGTCCTGGCGACGCTGGTGCAGTGCTTC 1453

Qy 161 gactgggacacggttgatggagctcaggtttgacatgaagctancggcgggctgaccatg 220

I I I II I I I I I I I I I I I I I I II I I I I I I I I I I I I

Db 1454 GACTGGGAACCGGTGGACGGCGTGAAGGTGGACATGACGGAGGGGGGAGGGTTCACCATC 1513

Qy 221 ccccgggccgtcccgttggaggccatgtgcangccgcgtacagctatgcgtggtgttctt 280

II I I I I I I I I I I I I I I II I I I II I I I I I I I I I II I I I I I I I I I I I
Db 1514 CCAAAGGCCGTGCCGTTGGAGGCCGTGTGCAGGCCGCGCGCGGTCATGCGCGACGTGCTT 1573

Qy 281 aagaggctctgaaaacctcatggatcgaat 310
III I I I I I III I I I I I
Db 1574 CAGAACCTCTAATCAACTAGTACCTTGCAT 1603



RESULT 3
AF321857
LOCUS       AF321857     1810 bp    mRNA            PLN       18-APR-2001
DEFINITION  Lolium rigidum clone FHH-y putative cytochrome P450 mRNA, complete
            cds.
ACCESSION   AF321857
VERSION     AF321857.1  GI:13661747
KEYWORDS
SOURCE      Lolium rigidum.
  ORGANISM  Lolium rigidum
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae;
            Pooideae; Poeae; Lolium.
REFERENCE   1  (bases 1 to 1810)
  AUTHORS   Fischer,T.C., Klattig,J.T. and Gierl,A.
  TITLE     A general cloning strategy of divergent plant cytochrome P450 genes
            and its application in Lolium rigidum and Ocimum basilicum
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 1810)
  AUTHORS   Fischer,T.C., Klattig,J.T. and Gierl,A.
  TITLE     Direct Submission
  JOURNAL   Submitted (16-NOV-2000) Lehrstuhl fuer Zierpflanzenbau,
            TU-Muenchen, Am Hochanger 4, Freising 85350, Germany
FEATURES             Location/Qualifiers
     source          1..1810
                     /organism="Lolium rigidum"
                     /isolate="SLR 31"
                     /db_xref="taxon:89674"
                     /clone="FHH-y"
     CDS             38..1591
                     /codon_start=1
                     /product="putative cytochrome P450"
                     /protein_id="AAK38081.1"
                     /db_xref="GI:13661748"
                     /translation="MDKAYIAILSCAFLFLVHYVLGKVSDGRRGKKGAVQLPPSPPAV
                     PFLGHLHLVDKPIHATMCRLAARLGPVFSLRLGSRRAVVVSSSECARECFTEHDVTFA
                     NRPKFPSQLLVSFNGTALVTSSYGPHWRNLRRVATVQLLSAHRVACMSGVIAAEVRAM
                     ARRLFHAAEASPDGAARVQLKRRLFELSLSVLMETIAQTKATRSEADADTDMSVEAQE
                     FKEVVDKLIPHLGAANMWDYLPVMRWFDVFGVRNKILHAVSRRDAFLRRLIDAERRRL
                     ADGGSDGDKKSMIAVLLTLQKTEPKVYTDTMITALCANLFGAGTETTSTTTEWAMSLL
                     LNHPAALKKAQAEIDASVGTSRLVSVDDVPSLAYLQCIVNETLRLYPAAPLLLPHESS
                     ADCKVGGYNVPADTMLIVNAYAIHRDPAAWEHPLVFRPERFEDGKAEGLFMIPFGMGR
                     RRCPGETLALRTIGMVLATLVQCFDWEPVDGVNVDMTEGGGFTIPKAVPLEAVCRPRA
                     VMRDVLQSI"
BASE COUNT      329 a    578 c    567 g    336 t
ORIGIN



Query Match 31.5%; Score 136.2; DB 8; Length 1810;

Best Local Similarity 69.6%; Pred. No. 5.5e-24; 

Matches 183; Conservative 0; Mismatches 80; Indels 0; Gaps 0; 

Qy 41 ggcggcaaggccaaggggcccctgctgatccctttcgggatggggcggcccaattgcccc 100 

I I I I I I I II I I Mil l I II I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1340 GACGGCAAGGCCGAGGGTCTGTTCATGATACCGTTCGGGATGGGGCGGCGGAGGTGCCCC 1399

Qy 101 ggggaaacgctcgcgctgcggaccgtcgggctggtgctcgcaacgctgctcaattgcttc 160 

I II I I I I I I I I I I I I I I I I I I I I I I I I I I i I I I I I I I I I I I I I I I I I 
Db 1400 GGGGAGACGCTGGCGCTGCGGACGATCGGAATGGTCCTGGCGACGCTGGTGCAGTGCTTC 1459

Qy 161 gactgggacacggttgatggagctcaggtttgacatgaagctancggcgggctgaccatg 220 

I I I I I I I I I I I I I I I I I III I I I I II I I I I I I 

Db 1460 GACTGGGAACCGGTGGACGGCGTGAATGTGGACATGACGGAGGGGGGAGGGTTCACCATC 1519

Qy 221 ccccgggccgtcccgttggaggccatgtgcangccgcgtacagctatgcgtggtgttctt 280 

II I I I I I I I I I I I I I II I II M I I I I I I I I I I II I I I I I I I I II I 

Db 1520 CCAAAGGCCGTGCCGTTGGAGGCCGTGTGCAGGCCGCGCGCGGTCATGCGCGACGTGCTT 1579 

Qy 281 aagaggctctgaaaacctcatgg 303 

I I I I I I I I III II 
Db 1580 CAGAGCATCTAATCAACTAGTAG 1602 



RESULT 4
AF321855
LOCUS       AF321855     1795 bp    mRNA            PLN       18-APR-2001
DEFINITION  Lolium rigidum clone FHH-v putative cytochrome P450 mRNA, complete
            cds.
ACCESSION   AF321855
VERSION     AF321855.1  GI:13661743
KEYWORDS
SOURCE      Lolium rigidum.
  ORGANISM  Lolium rigidum
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae;
            Pooideae; Poeae; Lolium.
REFERENCE   1  (bases 1 to 1795)
  AUTHORS   Fischer,T.C., Klattig,J.T. and Gierl,A.
  TITLE     A general cloning strategy of divergent plant cytochrome P450 genes
            and its application in Lolium rigidum and Ocimum basilicum
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 1795)
  AUTHORS   Fischer,T.C., Klattig,J.T. and Gierl,A.
  TITLE     Direct Submission
  JOURNAL   Submitted (16-NOV-2000) Lehrstuhl fuer Zierpflanzenbau,
            TU-Muenchen, Am Hochanger 4, Freising 85350, Germany
FEATURES             Location/Qualifiers
     source          1..1795
                     /organism="Lolium rigidum"
                     /isolate="SLR 31"
                     /db_xref="taxon:89674"
                     /clone="FHH-v"
     CDS             32..1585
                     /codon_start=1
                     /product="putative cytochrome P450"
                     /protein_id="AAK38079.1"
                     /db_xref="GI:13661744"
                     /translation="MDKAYIAILSSAFLFLVHYVLGKVSDGRRGKKGAVQLPPSPPAV
                     PFLGHLHLVEKPIHATMCRLAARLGPVFSLRLGSRRAVVVSSSECARECFTEHDVTFA
                     NRPKFPSQLLVSFNGTALVTSSYGPHWRNLRRVATVQLLSAHRVTCMSGVIAAEVRAM
                     ARRLFHAAEASPDGAARVQLKRRLFELSLSVLMETIAQTKATRSEADADTDMSLEAQE
                     FKEVVDKLIPHLGAANMWDYLPVMRWFDVFGVRSKILHAVSRRDAFLRRLINAERRRL
                     ADGGSDGDKKSMIAVLLTLQKTEPKVYTDTMITALCANLFGAGTETTSTTTEWAMSLL
                     LNHPAALKKAQAEIDASVGTSRLVSVDDVPSLAYLQCIVSETLRLYPAAPLLLPHESS
                     ADCKVGGYNVPADTMLIVNAYAIHRDPAAWEDPLEFKPERFEDGKAEGLFMIPFGMGR
                     RRCPGETLALRTIGMVLATLVQCFDWEPVDGVKVDMTEGGGFTIPKAVPLEAVCRPRV
                     VMRDVLQNL"
BASE COUNT      325 a    582 c    559 g    329 t
ORIGIN



Query Match 30.8%; Score 133; DB 8; Length 1795;

Best Local Similarity 69.8%; Pred. No. 3.5e-23;

Matches 178; Conservative 0; Mismatches 77; Indels 0; Gaps 0;

Qy 44 ggcaaggccaaggggcccctgctgatccctttcgggatggggcggcccaattgccccggg 103

M I I I I I I I I I I II I I I I I I II I I I I I I I I I I I I I I I I I I I I I I II I
Db 1337 GGCAAGGCCGAGGGGCTGTTCATGATACCGTTCGGGATGGGGCGGCGGAGGTGCCCCGGA 1396

Qy 104 gaaacgctcgcgctgcggaccgtcgggctggtgctcgcaacgctgctcaattgcttcgac 163
I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1397 GAGACGCTAGCACTACGGACGATCGGCATGGTCCTGGCGACGCTGGTGCAGTGCTTCGAC 1456

Qy 164 tgggacacggttgatggagctcaggtttgacatgaagctancggcgggctgaccatgccc 223
I I I II I I I I I I I I I I I I I II I I I I I I I I I I I I I

Db 1457 TGGGAACCGGTGGACGGCGTGAAGGTGGACATGACAGAGGGGGGAGGGTTCACCATCCCA 1516

Qy 224 cgggccgtcccgttggaggccatgtgcangccgcgtacagctatgcgtggtgttcttaag 283
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1517 AAGGCCGTGCCGTTGGAGGCCGTGTGCAGGCCGCGCGTGGTCATGCGCGACGTGCTTCAG 1576

Qy 284 aggctctgaaaacct 298

Db 1577 AACCTCTAATCATCT 1591



RESULT 5
AF140486
LOCUS       AF140486      433 bp    mRNA            PLN       11-MAY-1999
DEFINITION  Oryza sativa cytochrome P450 mRNA, partial cds.
ACCESSION   AF140486
VERSION     AF140486.1  GI:4768971
KEYWORDS
SOURCE      Oryza sativa.
  ORGANISM  Oryza sativa
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae;
            Ehrhartoideae; Oryzeae; Oryza.
REFERENCE   1  (bases 1 to 433)
  AUTHORS   Liu,J. and Yang,J.
  TITLE     Suppression subtractive hybridization (SSH) identified candidate
            genes that are differentially expressed at rice young panicle
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 433)
  AUTHORS   Liu,J. and Yang,J.S.
  TITLE     Direct Submission
  JOURNAL   Submitted (05-APR-1999) Genetics, Institute of Genetics, No. 220
            Handan Road, Shanghai 200433, China
FEATURES             Location/Qualifiers
     source          1..433
                     /organism="Oryza sativa"
                     /db_xref="taxon:4530"
                     /tissue_type="panicle"
     CDS             <1..320
                     /codon_start=3
                     /product="cytochrome P450"
                     /protein_id="AAD29699.1"
                     /db_xref="GI:4768972"
                     /translation="SMQRDPRVWEDPDKFIPERFKGFKVDRSGWMMPFGMGRRKCPGE
                     GLALRTVGMALGVMIQCFQWERLGKKKVDMSEGSGLTMPTAVPLMAMCLPRVEMESVL
                     KSL"
BASE COUNT      116 a     83 c    123 g    111 t
ORIGIN



Query Match 18.8%; Score 81.2; DB 8; Length 433; 

Best Local Similarity 59.1%; Pred. No. 4e-10; 

Matches 137; Conservative 0; Mismatches 95; Indels 0; Gaps 0;

Qy 64 gctgatccctttcgggatggggcggcccaattgccccggggaaacgctcgcgctgcggac 123 

I I M I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I II I I MM 
Db 92 GATGATGCCCTTCGGTATGGGGAGGCGGAAGTGCCCCGGTGAAGGCCTTGCTCTTAGGAC 151 

Qy 124 cgtcgggctggtgctcgcaacgctgctcaattgcttcgactgggacacggttgatggagc 183 

II I I I II I M I I Ml I I I I II I M I II III 

Db 152 GGTGGGGATGGCGCTAGGGGTTATGATACAATGCTTTCAGTGGGAGCGGCTCGGAAAGAA 211 

Qy 184 tcaggtttgacatgaagctancggcgggctgaccatgccccgggccgtcccgttggaggc 243
II I I I I I I I I I I II II I I I I I M M II I III

Db 212 GAAGGTTGATATGAGTGAAGGTTCTGGGCTCACCATGCCTACGGCCGTGCCTCTCATGGC 271



Qy 244 catgtgcangccgcgtacagctatgcgtggtgttcttaagaggctctgaaaa 295 

I II I I I I I I I I I I III I I I I I I I I I I I I I III 

Db 272 CATGTGCCTACCACGTGTGGAGATGGAGTCTGTGCTCAAAAGTCTCTAGAAA 323 



RESULT 6
AF082028
LOCUS       AF082028     1373 bp    mRNA            PLN       15-JUL-1999
DEFINITION  Hemerocallis hybrid cultivar senescence-associated protein 3 (SA3)
            mRNA, partial cds.
ACCESSION   AF082028
VERSION     AF082028.1  GI:3551949
KEYWORDS
SOURCE      Hemerocallis hybrid cultivar.
  ORGANISM  Hemerocallis hybrid cultivar
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; Liliopsida; Asparagales;
            Hemerocallidaceae; Hemerocallis.
REFERENCE   1  (bases 1 to 1373)
  AUTHORS   Panavas,T., Pikula,A., Reid,P.D., Rubinstein,B. and Walker,E.L.
  TITLE     Identification of senescence-associated genes from daylily petals
  JOURNAL   Plant Mol. Biol. 40 (2), 237-248 (1999)
  MEDLINE   99339248
REFERENCE   2  (bases 1 to 1373)
  AUTHORS   Panavas,T., Pikula,A., Reid,P.D., Rubinstein,B. and Walker,E.L.
  TITLE     Direct Submission
  JOURNAL   Submitted (04-AUG-1998) Biology, University of Massachusetts,
            Morrill Science Center, Amherst, MA 01003, USA
FEATURES             Location/Qualifiers
     source          1..1373
                     /organism="Hemerocallis hybrid cultivar"
                     /cultivar="Stella d'Oro"
                     /db_xref="taxon:80862"
                     /tissue_type="senescing petals"
     gene            <1..1373
                     /gene="SA3"
     CDS             <1..1121
                     /gene="SA3"
                     /function="putative cyt P450-containing fatty acid
                     hydroxylase"
                     /note="mRNA accumulates in senescing petals"
                     /codon_start=3
                     /product="senescence-associated protein 3"
                     /protein_id="AAC34853.1"
                     /db_xref="GI:3551950"
                     /translation="STEIFSPVRIRSLAAVRQEEVKLMITGILASTSTDNSVKVNMKV
                     VFSELMFNVIMKIIAGKRYFGVNTDSEVEEGQKFRVVFDEMFSTLEVASPQDFLPFLK
                     WFGFKRMENRLTKLAKELDQLFQKLIEERRSERGKVQSTVIDVLLSLQETDREQYSDK
                     LIKGMILSLIAAGTHTTAGTMEWAMSLLLNHPEALLKVRDEIDKKVGQDRLIDHSDLQ
                     NLSYLNNAIKESLRLFPTAPLLLAHESSAECTVGGFTIPSNTILFANAYALHRDPKVW
                     TDPVSFKPERFENNGQQGSRIYVPFGLGRRSCPGEGLATQVVGLALGTLIQCFEWDRN
                     GEEKVDMTDGSGLAMHMEKPLEAMCKPRQSIVDVINRL"
BASE COUNT      440 a    253 c    329 g    351 t
ORIGIN



Query Match 16.0%; Score 69; DB 8; Length 1373; 

Best Local Similarity 55.8%; Pred. No. 4.3e-07; 

Matches 129; Conservative 0; Mismatches 102; Indels 0; Gaps 0;

Qy 69 tccctttcgggatggggcggcccaattgccccggggaaacgctcgcgctgcggaccgtcg 128 

I I I I I I I I I I I I II I I I I I I I I I I I I I 1 I I I I II III 
Db 898 TGCCATTCGGGTTGGGGAGGCGGAGCTGTCCAGGTGAAGGGCTAGCAACGCAAGTTGTGG 957 

Qy 129 ggctggtgctcgcaacgctgctcaattgcttcgactgggacacggttgatggagctcagg 188 

I III II II III I I I I I I I I I I I I I I I I I I I I I III 

Db 958 GTTTGGCTTTGGGGACATTGATTCAATGCTTCGAGTGGGACCGAAATGGTGAAGAGAAGG 1017 

Qy 189 tttgacatgaagctancggcgggctgaccatgccccgggccgtcccgttggaggccatgt 248

I I I I I I I MINI II II I I I I II I I I I 

Db 1018 TGGACATGACTGACGGATCAGGGCTCGCCATGCATATGGAAAAGCCTCTAGAGGCTATGT 1077 

Qy 249 gcangccgcgtacagctatgcgtggtgttcttaagaggctctgaaaacctc 299 

III I I I I I III I I I I I I I I I I I I I I III II 
Db 1078 GCAAACCTCGCCAAAGTATTGTTGATGTCATCAATAGGCTTTAGAAATTTC 1128 
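
The marker line printed between each Qy and Db row can be regenerated from the two sequence rows themselves, which is useful where the printed markers have lost their column positions. The sketch below assumes that a marker denotes a case-insensitive identity and that the ambiguity code `n` in the query never counts as a match; both are assumptions about the report, and `match_line` is a hypothetical helper name. The example uses the last row of the Result 6 alignment.

```python
def match_line(query, subject):
    """Return a string with '|' at identical positions and ' ' elsewhere.

    Both rows must already be aligned (equal length, no reordering).
    Comparison ignores case; 'n' is treated as matching nothing.
    """
    if len(query) != len(subject):
        raise ValueError("aligned rows must have equal length")
    marks = []
    for q, s in zip(query.lower(), subject.lower()):
        marks.append("|" if q == s and q != "n" else " ")
    return "".join(marks)


qy = "gcangccgcgtacagctatgcgtggtgttcttaagaggctctgaaaacctc"  # Qy 249..299
db = "GCAAACCTCGCCAAAGTATTGTTGATGTCATCAATAGGCTTTAGAAATTTC"  # Db 1078..1128

line = match_line(qy, db)
print(qy)
print(line)
print(db)
print(line.count("|"), "identical positions in this row")
```

Summing the per-row counts over all rows of an alignment should reproduce its reported Matches value if the assumptions hold.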



RESULT 7 
AP003711/c 
LOCUS AP003711 165909 bp DNA HTG 31-MAY-2001
DEFINITION Oryza sativa chromosome 6 clone P0417G12, *** SEQUENCING IN
PROGRESS ***, in ordered pieces.
ACCESSION AP003711
VERSION AP003711.1 GI:14270111
KEYWORDS HTG; HTGS_PHASE2.
SOURCE Oryza sativa (cultivar:Nipponbare) DNA, clone:P0417G12.
ORGANISM Oryza sativa
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae;
Ehrhartoideae; Oryzeae; Oryza.
REFERENCE 1 (sites)
AUTHORS Sasaki,T., Matsumoto,T. and Yamamoto,K.
TITLE Oryza sativa nipponbare (GA3) genomic DNA, chromosome 6, PAC
clone:P0417G12
JOURNAL Published Only in DataBase (2001) In press
REFERENCE 2 (bases 1 to 165909)
AUTHORS Sasaki,T., Matsumoto,T. and Yamamoto,K.
TITLE Direct Submission
JOURNAL Submitted (30-MAY-2001) Takuji Sasaki, National Institute of
Agrobiological Resources, Rice Genome Research Program; Kannondai
2-1-2, Tsukuba, Ibaraki 305-8602, Japan
(E-mail:tsasaki@abr.affrc.go.jp, URL:http://rgp.dna.affrc.go.jp/,
Tel:81-298-38-7441, Fax:81-298-38-7468)
COMMENT NOTE: It currently consists of 1 contigs. Gaps between the contigs
are represented as runs of N. The order of the pieces is believed
to be correct as given, however the sizes of the gaps between them
are based on estimates that have provided by the submitter. This
sequence will be replaced by the finished sequence as soon as it is
available and the accession number will be preserved.
* NOTE: This is a 'working draft' sequence.
* This sequence will be replaced
* by the finished sequence as soon as it is available and
* the accession number will be preserved.
FEATURES Location/Qualifiers
source 1..165909
/organism="Oryza sativa"
/cultivar="Nipponbare"
/db_xref="taxon:4530"
/chromosome="6"
/clone="P0417G12"
BASE COUNT 47417 a 34811 c 35629 g 47902 t 150 others
ORIGIN 



Query Match 15.9%; Score 68.8; DB 2; Length 165909; 

Best Local Similarity 58.4%; Pred. No. 3.4e-07; 

Matches 118; Conservative 0; Mismatches 84; Indels 0; Gaps 0; 

Qy 61 cctgctgatccctttcgggatggggcggcccaattgccccggggaaacgctcgcgctgcg 120 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I 
Db 84463 CCACCTGCTGCCGTTCGGGTCGGGGCGGCGGATCTGCCCCGGCGCGTCGCTGGCGATGCT 84404 

Qy 121 gaccgtcgggctggtgctcgcaacgctgctcaattgcttcgactgggacacggttgatgg 180

I II I I I I I I I I I III I I I I M II I I I I I I I I I II 
Db 84403 GGTGGTGCAGGCGGCGCTGGCCGCCATGGTGCAGTGCTTCGAGTGGAGCCCCGTCGGCGG 84344

Qy 181 agctcaggtttgacatgaagctancggcgggctgaccatgccccgggccgtcccgttgga 240 

I I I I I I I I I I I I I I I I I I till III I I I I I I 

Db 84343 CGCGCCGGTGGACATGGAGGAGGGGCCCGGGCTGACGCTGCCGCGGAAGCGCCCGCTCGT 84284

Qy 241 ggccatgtgcangccgcgtaca 262 

II I I I II I I I I 
Db 84283 CTGCACCGTCTCGCCGCGGATA 84262



RESULT 8 

D78607 

LOCUS D78607 1652 bp mRNA PLN 09-JUN-1998
DEFINITION Arabidopsis thaliana mRNA for cytochrome P450 monooxygenase,
complete cds, clone P450-66-8.
ACCESSION D78607
VERSION D78607.1 GI:3164143
KEYWORDS .
SOURCE Arabidopsis thaliana (strain:Columbia) 7-d seedlings cDNA to mRNA,
clone_lib:lambda ZAP II clone:P450-66-8.
ORGANISM Arabidopsis thaliana
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots;
Rosidae; eurosids II; Brassicales; Brassicaceae; Arabidopsis.
REFERENCE 1 (bases 1 to 1652)
AUTHORS Mizutani,M.
TITLE Direct Submission
JOURNAL Submitted (07-DEC-1995) to the DDBJ/EMBL/GenBank databases,
Masaharu Mizutani, International Research Laboratories, Ciba-Geigy
(Japan), Bio-organics Department; 10-66 Miyuki-cho, Takarazuka,
Hyogo 665, Japan (E-mail:masaharu.mizutani@jpta.mhs.ciba.com,
Tel:0797-74-2464, Fax:0797-74-2455)
REFERENCE 2 (sites)
AUTHORS Mizutani,M., Ward,E. and Ohta,D.
TITLE Cytochrome P450 superfamily in Arabidopsis thaliana: isolation of
cDNAs, differential expression, and RFLP mapping of multiple
cytochromes P450
JOURNAL Plant Mol. Biol. 37 (1), 39-52 (1998)
MEDLINE 98281573
FEATURES Location/Qualifiers
source 1..1652
/organism="Arabidopsis thaliana"
/strain="Columbia"
/db_xref="taxon:3702"
/clone="P450-66-8"
/clone_lib="lambda ZAP II"
/tissue_type="7-d seedlings"
gene 51..1553
/gene="CYP91A2"
CDS 51..1553
/gene="CYP91A2"
/codon_start=1
/product="cytochrome P450 monooxygenase"
/protein_id="BAA28539.1"
/db_xref="GI:3164144"
/translation="MLYFILLPLLFLVISYKFLYSKTQRFNLPPGPPSRPFVGHLHLM
KPPIHRLLQRYSNQYGPIFSLRFGSRRVVVITSPSLAQESFTGQNDIVLSSRPLQLTA
KYVVYNHTTVGTAPYGDHWRNLRRMCSQEILSSHRLIIFQHIRKDEILRMLTRLSRYT
QTSNESNDFTHIELEPLLSDLTFNNIVRMVTGKRYYGDDVNNKEEAELFKKLVYDIAM
YSGANHSADYLPILKLFGNKFEKEVKAIGKSMDDILQRLLDECRRDKEGNTMVNHLIS
LQQQQPEYYTDVIIKGLMMSMMLAGTETSAVTLEWAMANLLRNPEVLEKARSEIDEKI
GKDRLIDESDIAVLPYLQNVVSETFRLFPVAPFLIPRSPTDDMKIGGYDVPRDTIVMV
NAWAIHRDPEIWEEPEKFNPDRYNDGCGSDYYVYKLMPFGNGRRTCPGAGLGQRIVTL
ALGTLIQCFEWENVKGEEMDMSESTGLGMRKMDPLRAMCRPRPIMSKLLL"
polyA_signal 1630..1635
BASE COUNT 502 a 406 c 348 g 396 t
ORIGIN



Query Match 14.0%; Score 60.6; DB 8; Length 1652;

Best Local Similarity 53.2%; Pred. No. 5.5e-05;

Matches 126; Conservative 0; Mismatches 111; Indels 0; Gaps 0;



Qy 32 gatggctccggcggcaaggccaaggggcccctgctgatccctttcgggatggggcggccc 91

I I I I I I I 1 III III I I I I I I I I I I I I I II I I I I I
Db 1299 GACGGATGCGGAAGCGATTACTATGTTTACAAGCTGATGCCGTTTGGGAATGGCCGGAGA 1358

Qy 92 aattgccccggggaaacgctcgcgctgcggaccgtcgggctggtgctcgcaacgctgctc 151

I I I I I I I I I I II I I I I I II III Ml I
Db 1359 ACTTGTCCCGGCGCCGGATTAGGTCAGAGGATTGTGACATTGGCGCTTGGAACGTTGATT 1418

Qy 152 aattgcttcgactgggacacggttgatggagctcaggtttgacatgaagctancggcggg 211

I I I I I I I I I I I I I I II I I I I III I I III
Db 1419 CAATGCTTTGAATGGGAGAATGTGAAAGGGGAAGAGATGGATATGTCTGAGAGTACTGGG 1478

Qy 212 ctgaccatgccccgggccgtcccgttggaggccatgtgcangccgcgtacagctatg 268

II MM I I MM I II I I I I I I I I M I I MM
Db 1479 TTGGGTATGCGTAAGATGGATCCTTTACGGGCCATGTGTAGGCCTAGGCCCATTATG 1535



RESULT 9 



AR074108 
LOCUS AR074108 622 bp DNA PAT 28-AUG-2000
DEFINITION Sequence 17 from patent US 5952486.
ACCESSION AR074108
VERSION AR074108.1 GI:10000868
KEYWORDS .
SOURCE Unknown.
ORGANISM Unknown.
Unclassified.
REFERENCE 1 (bases 1 to 622)
AUTHORS Bloksberg,L.N., Havukkala,I. and Grierson,A.W.
TITLE Materials and methods for the modification of plant lignin content
JOURNAL Patent: US 5952486-A 17 14-SEP-1999;
FEATURES Location/Qualifiers
source 1..622
/organism="unknown"
BASE COUNT 170 a 117 c 178 g 157 t
ORIGIN



Query Match 13.8%; Score 59.6; DB 6; Length 622;

Best Local Similarity 54.1%; Pred. No. 0.00011;

Matches 119; Conservative 0; Mismatches 101; Indels 0; Gaps 0;



Qy 61 cctgctgatccctttcgggatggggcggcccaattgccccggggaaacgctcgcgctgcg 120 

II II I I I I I I I I.I M I I I I I I I I I I I I I I I I I I I 

Db 180 CCGACTATTGCCGTTTGGGATGGGGAGGAGAAGTTGTCCTGGTGCTGGCCTTGCCAATAG 239 

Qy 121 gaccgtcgggctggtgctcgcaacgctgctcaattgcttcgactgggacacggttgatgg 180 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 240 AGTGGTGAGCTTGGTCCTGGCGGCGCTTATTCAGTGCTTCGAATGGGAACGAGTTGGCGA 299

Qy 181 agctcaggtttgacatgaagctancggcgggctgaccatgccccgggccgtcccgttgga 240

II III I I I I I I I I I I I I I I I I I I I I I 

Db 300 AGAATTGGTGGACTTGTCCGAGGGGACGGGACTCACAATGCCAAAGAGAGAGCCATTGGA 359 

Qy 241 ggccatgtgcangccgcgtacagctatgcgtggtgttctt 280 

I I I I I I I I I I I I I I I I III II III 
Db 360 GGCCTTGTGCAAAGCGCGTGAATGCATGATAGCTAATGTT 399 



RESULT 10 
ATF6G17/c
LOCUS ATF6G17 101009 bp DNA PLN 03-MAR-1999
DEFINITION Arabidopsis thaliana DNA chromosome 4, BAC clone F6G17 (ESSA
project).
ACCESSION AL035601
VERSION AL035601.1 GI:4468801
KEYWORDS .
SOURCE thale cress.
ORGANISM Arabidopsis thaliana
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots;
Rosidae; eurosids II; Brassicales; Brassicaceae; Arabidopsis.
REFERENCE 1 (bases 1 to 101009)
AUTHORS Bevan,M., Koetter,P., Hempel,S., Entian,K.-D., Bancroft,I.,
Mewes,H.W., Mayer,K.F.X. and Schueller,C.



JOURNAL Unpublished
REFERENCE 2 (bases 1 to 101009)
AUTHORS EU Arabidopsis sequencing, project.
TITLE Direct Submission
JOURNAL Submitted (03-MAR-1999) MIPS, at the Max-Planck-Institut fuer
Biochemie, Am Klopferspitz 18a, D-82152 Martinsried, FRG, E-mail:
schuelle@mips.biochem.mpg.de, mayer@mips.biochem.mpg.de Project
Coordinator: Mike Bevan, Molecular Genetics Department, Cambridge
Laboratory, John Innes Centre, Colney Lane, NR4 7UJ Norwich, UK,
E-mail michael.bevan@bbsrc.ac.uk
COMMENT Information on performance of analysis and a more detailed
annotation of this entry and other sequences of chromosome 4 can be
viewed at: http://websvr.mips.biochem.mpg.de/proj/thal/.
FEATURES Location/Qualifiers
source 1..101009
/organism="Arabidopsis thaliana"
/variety="Columbia"
/db_xref="taxon:3702"
/chromosome="4"
misc_feature 1..18845
/note="position 1-18845 overlap to EMBL accession Z99707;
please refer to this entry for analysis and annotation"
gene complement(join(18286..18900,19005..19391,19523..20020))
/gene="F6G17.10"
gene 18286..20020
/gene="F6G17.10"
CDS complement(join(18286..18900,19005..19391,19523..20020))
/gene="F6G17.10"
/note="similarity to cytochrome P450 monooxygenase,
Arabidopsis thaliana, D78606
Contains Cytochrome P450 cysteine heme-iron ligand
signature [FGLGRRACPG]"
/codon_start=1
/product="cytochrome p450-like protein"
/protein_id="CAB38203.1"
/db_xref="GI:4468802"
/translation="MEALMLIFTFCFIVLSLIFLIGRIKRKLNLPPSPAWALPVIGHL
RLLKPPLHRVFLSVSQSLGDAPIISLRLGNRLLFVVSSHSIAEECFTKNDVILANRQT
TISTKHISYGNSTVVSASYSEHWRNLRRIGALEIFSAHRLNSFSSIRRDEIRRLIGRL
LRNSSYGFTKVEMKSMFSDLTFNNIIRMLAGKCYYGDGKEDDPEAKRVRTLIAEAMSS
SGPGNAADYIPILTWITYSETRIKKLAGRLDEFLQGLVDEKREGKEKKENTMVDHLLC
LQETQPEYYMDRIIKGTMLSLIAGGTDTTAVTLEWALSSLLNNPEVLNKARDEIDRMI
GVDRLLEESDIPNLPYLQNIVSETLRLYPAAPMLLPHVASKDCKVGGYDMPRGTMLLT
NAWAIHRDPLLWDDPTSFKPERFEKEGEAKKLMPFGLGRRACPGSGLAQRLVTLSLGS
LIQCFEWERIGEEEVDMTEGPGLTMPKARPLEAMCRARDFVGKILPDSS"
exon complement(18286..18900)
/gene="F6G17.10"
/number=1
intron complement(18901..19004)
/number=1
exon complement(19005..19391)
/gene="F6G17.10"
/number=2
intron complement(19392..19522)
/number=2
exon complement(19523..20020)
/gene="F6G17.10"



/number=3
gene 21133..22840
/gene="F6G17.20"
exon complement(21133..21744)
/gene="F6G17.20"
/number=1
gene complement(join(21133..21744,21842..22225,22343..22840))
/gene="F6G17.20"
CDS complement(join(21133..21744,21842..22225,22343..22840))
/gene="F6G17.20"
/note="similarity to cytochrome P450, Glycyrrhiza
echinata, AB001379
Contains Cytochrome P450 cysteine heme-iron ligand
signature [FGLGRRACPG]
contains EST gb:AA586064, H76015, T41596, N38867"
/codon_start=1
/product="cytochrome P450-like protein"
/protein_id="CAB38204.1"
/db_xref="GI:4468803"
/translation="METKTLIFSILFVVLSLIYLIGKLKRKPNLPPSPAWSLPVIGHL
RLLKPPIHRTFLSLSQSLNNAPIFSLRLGNRLVFVNSSHSIAEECFTKNDVVLANRPN
FILAKHVAYDYTTMIAASYGDHWRNLRRIGSVEIFSNHRLNSFLSIRKDEIRRLVFRL
SRNFSQEFVKVDMKSMLSDLTFNNILRMVAGKRYYGDGVEDDPEAKRVRQLIADVVAC
AGAGNAVDYLPVLRLVSDYETRVKKLAGRLDEFLQGLVDEKREAKEKGNTMIDHLLTL
QESQPDYFTDRIIKGNMLALILAGTDTSAVTLEWALSNVLNHPDVLNKARDEIDRKIG
LDRLMDESDISNLPYLQNIVSETLRLYPAAPMLLPHVASEDCKVAGYDMPRGTILLTN
VWAIHRDPQLWDDPMSFKPERFEKEGEAQKLMPFGLGRRACPGSGLAHRLINLTLGSL
IQCLEWEKIGEEVDMSEGKGVTMPKAKPLEAMCRARPSVVKIFNESV"
intron complement(21745..21841)
/number=1
exon complement(21842..22225)
/gene="F6G17.20"
/number=2
intron complement(22226..22342)
/number=2
exon complement(22343..22840)
/gene="F6G17.20"
/number=3
gene 23202..25100
/gene="F6G17.30"
CDS complement(23202..25100)
/gene="F6G17.30"
/note="similarity to various predicted proteins,
Arabidopsis thaliana
Contains Cytochrome c family heme-binding site signature
[CSDCHT]"
/codon_start=1
/product="putative protein"
/protein_id="CAB38205.1"
/db_xref="GI:4468804"
/translation="MASSPLLATSLPQNQLSTTATARFRLPPPEKLAVLIDKSQSVDE
VLQIHAAILRHNLLLHPRYPVLNLKLHRAYASHGKIRHSLALFHQTIDPDLFLFTAAI 
NTASINGLKDQAFLLYVQLLSSEINPNEFTFSSLLKSCSTKSGKLIHTHVLKFGLGID 
PYVATGLVDVYAKGGDVVSAQKVFDRMPERSLVSSTAMITCYAKQGNVEAARALFDSM 
CERDIVSWNVMIDGYAQHGFPNDALMLFQKLLAEGKPKPDEITVVAALSACSQIGALE 
TGRWIHVFVKSSRIRLNVKVCTGLIDMYSKCGSLEEAVLVFNDTPRKDIVAWNAMIAG 
YAMHGYSQDALRLFNEMQGITGLQPTDITFIGTLQACAHAGLVNEGIRIFESMGQEYG 



IKPKIEHYGCLVSLLGRAGQLKRAYETIKNMNMDADSVLWSSVLGSCKLHGDFVLGKE 
IAEYLIGLNIKNSGIYVLLSNIYASVGDYEGVAKVRNLMKEKGIVKEPGISTIEIENK 
VHEFRAGDREHSKSKEIYTMLRKISERIKSHGYVPNTNTVLQDLEETEKEQSLQVHSE 
RLAIAYGLISTKPGSPLKIFKNLRVCSDCHTVTKLISKITGRKIVMRDRNRFHHFTDG 
SCSCGDFW" 

complement (23202 . . 25100) 

/gene="F6G17.30" 

/number=l 

complement (23202. .25100) 

/gene="F6G17.30" 

30884. .32930 

/gene="F6G17.40" 

30884. .31206 

/gene="F6G17.40" 

/number=l 

join(30884. .31206,31359. .314 60,31544. .32930) 
/gene="F6G17.40" 

/note="similarity to hypothetical protein, Glycine max, 
PIR2:S17433 

contains EST gb:T44614, AAA712644, AA395496" 
/codon_start=l 

/product="auxin-responsive GH3-like protein" 

/protein_id="CAB38206.1" 

/db_xref="GI: 4468805" 

/translation="MAVDSPLQSRMVSATTSEKDVKALKFIEEMTRNPDSVQEKVLGE

ILTRNSNTEYLKRFDLDGVVDRKTFKSKVPVVTYEDLKPEIQRISNGDCSPILSSHPI 

TEFLTSSGTSAGERKLMPTIEEDLDRRQLLYSLLMPVMNLYVPGLDKGKGLYFLFVKS 

ESKTSGGLPARPVLTSYYKSDHFKRRPYDPYNVYTSPNEAILCSDSSQSMYAQMLCGL 

LMRHEVLRLGAVFASGLLRAISFLQNNWKELARDISTGTLSSRIFDPAIKNRMSKILT ■ 

KPDQELAEFLVGVCSQENWEGIITKIWPNTKYLDVIVTGAMAQYIPTLEYYSGGLPMA 

CTMYASSESYFGINLKPMCKPSEVSYTIMPNMAYFEFLPHNHDGDGAAEASLDETSLV 

ELANVEVGKEYELVITTYAGLYRYRVGDILRVTGFHNSAPQFKFIRRKNVLLSVESDK 

TDEAELQKAVENASRLFAEQGTRVIEYTSYAETKTIPGHYVIYWELLGRDQSNALMSE 

EVMAKCCLEMEESLNSVYRQSRVADKSIGPLEIRVVRNGTFEELMDYAISRGASINQY 

KVPRCVSFTPIMELLDSRVVSAHFSPSLPHWSPERRR" 

31207. .31358 

/gene="F6G17.40" 

/ number =1 

31359. .31460 

/gene="F6G17.40" 

/number=2 

31461. .31543 

/gene="F6G17.40" 

/number=2 

31544. .32930 

/gene="F6G17.40" 

/number =3 

35807. .37359 

/gene="F6G17.50" 

join(35807..36127,36230..36289,36724..37359)
/gene="F6G17.50" 

/note="strong similarity to cytochrome P450 monooxygenase 

CYP91A2, Arabidopsis thaliana, D78607 

Contains Cytochrome P450 cysteine heme-iron ligand 

signature [FGNGRRSCPG] 

contains EST gb:AA712784" 

/codon_start=1
/product="cytochrome P450 monooxygenase-like protein"
/protein_id="CAB38207.1"
/db_xref="GI:4468806"
/translation="MVTGKRYYGDEVHNEEEANVFKKLVADINDCSGARHPGDYLPFM
KMFGGSFEKKVKALAEAMDEILQRLLEECKRDKDGNTMVNHLLSLQQNEPEYYTDVTI
KGLMLIFCFFGQLQILWFTNIETGWGMMIAGTDTSAVTLEWAMSSLLNHPEALEKAKL
EIDEKIGQERLIDEPDIANLPYLQNIVSETFRLYPAAPLLVPRSPTEDIKVGGYDVPR
GTMVMVNAWAIHRDPELWNEPEKFKPERFNGGEGGGRGEDVHKLMPFGNGRRSCPGAG
LGQKIVTLALGSLIQCFDWQKVNGEAIDMTETPGMAMRKKIPLSALCQSRPIMSKLQA
HLKG"
exon 35807..36127
/gene="F6G17.50"
/number=1
intron 36128..36229
/gene="F6G17.50"



Query Match 13.7%; Score 59.2; DB 8; Length 101009;

Best Local Similarity 54.1%; Pred. No. 9.2e-05;

Matches 118; Conservative 0; Mismatches 100; Indels 0; Gaps 0;



Qy 54 aggggcccctgctgatccctttcgggatggggcggcccaattgccccggggaaacgctcg 113 

III I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I 
Db 11133 AGGCTCGAAAGCTAATGCCGTTTGGGATGGGACGACGAGCTTGTCCTGGAGCTGAGCTTG 11074

Qy 114 cgctgcggaccgtcgggctggtgctcgcaacgctgctcaattgcttcgactgggacacgg 173 

I MM II I I I I III Ml II I II I I I I I I I I I 

Db 11073 GGAAGCGGTTAGTGAGCCTTGCTCTTGGGTGCTTGATTCAGTCTTTCGAGTGGGAGAGAG 11014 

Qy 174 ttgatggagctcaggtttgacatgaagctancggcgggctgaccatgccccgggccgtcc 233

I I I I I I I I I I I M I I I I I I I M I I II I 

Db 11013 TTGGTGCAGAACTTGTGGACATGACTGAAGGCGAAGGGATCACTATGCCTAAAGCTACTC 10954 



Qy 234 cgttggaggccatgtgcangccgcgtacagctatgcgt 271 

Mill I I I I I I II I I I I I I I II II 
Db 10953 CGTTGCGAGCTATGTGCAAGGCACGTGCCATTGTTGGT 10916 



RESULT 11 

ATCHRIV87/c
LOCUS ATCHRIV87 196339 bp DNA PLN 16-MAR-2000
DEFINITION Arabidopsis thaliana DNA chromosome 4, contig fragment No. 87.
ACCESSION AL161591
VERSION AL161591.2 GI:7270703
KEYWORDS .
SOURCE thale cress.
ORGANISM Arabidopsis thaliana
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots;
Rosidae; eurosids II; Brassicales; Brassicaceae; Arabidopsis.
REFERENCE 1 (bases 42610 to 143618; 123423 to 196339)
AUTHORS Rose,M., Hempel,S., Entian,K.-D., Mewes,H.W., Lemcke,K. and
Mayer,K.F.X.
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 196339)
AUTHORS EU Arabidopsis sequencing, project.
TITLE Direct Submission
JOURNAL Submitted (10-MAR-2000) MIPS, at the Max-Planck-Institut fuer
Biochemie, Am Klopferspitz 18a, D-82152 Martinsried, FRG, E-mail:
lemcke@mips.biochem.mpg.de, mayer@mips.biochem.mpg.de Project
Coordinator: Mike Bevan, Molecular Genetics Department, Cambridge
Laboratory, John Innes Centre, Colney Lane, NR4 7UJ Norwich, UK,
E-mail: michael.bevan@bbsrc.ac.uk
COMMENT Information on performance of analysis and a more detailed
annotation of this entry and other sequences of chromosomes 3, 4
and 5 can be viewed at: http://www.mips.biochem.mpg.de/proj/thal/
this fragment has an overlap with ATCHRIV86 at the 5' end and an
overlap with ATCHRIV88 at the 3' end.
FEATURES Location/Qualifiers
source 1..196339
/organism="Arabidopsis thaliana"
/variety="Columbia"
/db_xref="taxon:3702"
/chromosome="4"
gene 6146..7792
/gene="AT4g37210"
exon 6146..6474

/gene="AT4g37210" 
/number=l 

CDS join(6146. .6474,6702. .6919,7013. .7225,7313. .7403, 

7489. .7792) 
/gene="AT4g37210" 

/note="intron number 1 is a special U12 intron 

similarity to nuclear histone-binding protein N1/N2, 

Xenopus laevis, PIR2:A25680" 

/codon_start=l 

/product="putative protein" 

/protein_id="CAB80387.1"

/db_xref="GI:7270704"

/translation="MVEESASASEASVIQTLTEPATEIAQTLEPNLASIEATVESVVQ

GGTESTCNNDANNNNAADSAATEVCDEEREKTLEFAEELTEKGSVFLKENDFAEAVDC 

FSRALEIRVAHYGELDAECINAYYRYGLALLAKAQAEADPLGNMPKKEGEVQQESSNG 

ESLAPSVVSGDPERQGSSSGQEGSGGKDQGEDGEDCQDDDLSDADGDADEDESDLDMA 

WKMLDIARVITDKQSTETMEKVDILCSLAEVSLEREDIESSLSDYKNALSILERLVEP 

DSRRTAELNFRICICLETGCQPKEAIPYCQKALLICKARMERLSNEIKGASGSATSST 

VSEIDEGIQQSSNVPYIDKSASDKEVEIGDLAGLAEDLEKKASKLNLSVH" 
intron 6475. .6701 

/gene="AT4g37210" 

/number=l 
exon 6702..6919

/gene="AT4g37210" 

/number=2 
intron 6920. .7012 

/gene="AT4g37210" 

/number=2 
exon 7013. .7225 

/gene="AT4g37210" 

/number=3 
intron 7226. .7312 

/gene="AT4g37210" 

/number=3 
exon 7313. .7403 

/gene="AT4g37210" 

/number=4 
intron 7404. .7488 



/gene="AT4g37210" 

/number =4 

7489. .7792 

/gene="AT4g37210" 

/number=5 

8876. .9739 

/gene="AT4g37220" 

join(8876. .9051,9141. .9248,9330. .9400,9486. .9739) 
/gene="AT4g37220" 

/note="strong similarity to cold acclimation protein
WCOR413, Triticum aestivum, PATCHX:G1657855
Contains Prokaryotic membrane lipoprotein lipid attachment
site AA147-157

contains EST gb:AW033651.1, W43270, AA650647, AI996990.1,
AA728669, T42795, Z37671, AI100742, T42949, AA040998,
AA395771, AA657303, T41871, T45633"
/codon_start=1

/product="cold acclimation protein homolog"
/protein_id="CAB80388.1"
/db_xref="GI:7270705"

/translation="MGRGEFLAMKTEENAANLINSDMNEFVAAAKKLVKDVGMLGGVG

FGTSVLQWAASIFAIYLLILDRTNWKTKMLTTLLVPYIFFTLPSVIFQFFSGDFGKWI 

ALIAIIVRLFFPKEFPEWLEIPVALILIVVVSPSLIAWTLRESWVGAVICLVIACYLF 

HEHIKASGGFKNSFTQKNGISNTIGIVALLVYPVWTIFFHIF" 

8876. .9051 

/gene="AT4g37220" 

/number=l 

9052. .9140 

/gene="AT4g37220" 

/number=1

9141. .9248 

/gene="AT4g37220" 

/number =2 

9249. .9329 

/gene="AT4g37220" 

/number=2 

9330. .9400 

/gene="AT4g37220" 

/ number =3 

9401..9485

/gene="AT4g37220" 

/number=3 

9486. .9739 

/gene="AT4g37220" 

/number=4

10194. .11357 

/gene="AT4g37230" 

10194. .10398 

/gene="AT4g37230" 

/number=l 

join(10194..10398,11131..11357)
/gene="AT4g37230"

/note="possible frameshift in DNA sequence at pos.
51188-51200, 5' part of gene couldn't be reconstructed,
possible pseudogene, no ATG

strong similarity to photosystem II oxygen-evolving 
complex protein 1, spinach, PIR2:A23626 



contains EST gb:Z34685" 
/codon_start=l 

/product="photosystem II oxygen-evolving complex like 
protein (partial) " 
/protein_id="CAB80389.1" 
/db_xref="GI: 7270706" 

/trans lation="MTSLTYTLDEIEGPFEVDYAAVTVHNFLVGSVYRSCSRSSSSWH 

RRGYFGPKSIPSAFTQGHVGNKSDQYQGYDNAVALPARGNNEELAKENNKITLSVTKS 

NPESGEVIGAFESIQPSDTDLGATTPKDVKIQGIWYCQLDE" 

10399. .11130 

/gene="AT4g37230" 

/number=l 

11131. .11357 

/gene="AT4g37230" 

/number=2 

18388. .18822 

/gene="AT4g37240" 

/number=l 

18388. .18822 

/gene="AT4g37240" 

18388. .18822 

/gene="AT4g37240" 

/note="contains EST gb:Z18456"

/codon_start=1

/product="putative protein"

/protein_id="CAB80390.1"

/db_xref="GI:7270707"

/translation="MEFANPVKVGYVLLKYPMCFICNSDDMDFDDAVAAISADEELQL

GQI YFALPLCWLRQPLKAEEM7VALAVPCASSALMRGGGGGCRRKCVEPIVSDKLRMRVG 

SGDDTVGSGSGRRKVRNGDGGGSVSSSRRRKCYAAELSTIDE" 

21559. .23955 

/gene="AT4g37250" 

complement (21559 . .22253) 

/gene="AT4g37250" 

/number=l 

complement (join (21559. .22253, 22350. .23955) ) 
/gene="AT4g37250" 

complement (join (21559. .22253, 22350. .23955) ) 
/gene="AT4g37250" 

/note="similarity to protein kinase TMK1, receptor type 
precursor - Arabidopsis thaliana, PIR1:JQ1674 
contains EST gb:H76836" 
/codon_start=l 

/product="receptor kinase-like protein" 
/protein_id="CAB80391.1"
/db_xref="GI:7270708"

/translation="MELISVIFFFFCSVLSSSALNSDGLVLMKFKSSVLVDPLSLLQT
WNYKHESPCSWRGISCNNDSKVLTLSLPNSQLLGSIPSDLGSLLTLQSLDLSNNSFNG 
PLPVSFFNARELRFLDLSSNMISGEIPSAIGDLHNLLTLNLSDNALAGKLPTNLASLR 
NLTVVSLENNYFSGEIPGGWRVVEFLDLSSNLINGSLPPDFGGYSLQYLNVSFNQISG 
EIPPEIGVN FPRNVTVDLSFNNLTGPIPDSPVFLNQESNFFSGNPGLCGEPTRNPCLI 
PSSPSIVSEADVPTSTPAIAAIPNTIGSNPVTDPNSQQTDPNPRTGLRPGVIIGIVVG 
DIAGIGILAVIFLYIYRCKKNKIVDNNNNDKQRTETDTITLSTFSSSSSSPEESRRFR 
KWSCLRKDPETTPSEEEDEDDEDEESGYNANQRSGDNKLVTVDGEKEMEIETLLKASA 
YILGATGSSIMYKAVLEDGRVFAVRRLGENGLSQRRFKDFEPHIRAIGKLVHPNLVRL 
CGFYWGTDEKLVIYDFVPNGSLVNPRYRKGGGSSSPYHLPWETRLKIAKGIARGLAYL 
HEKKHVHGNLKPSNILLGHDMEPKIGDFGLERLLTGETSYIRAGGSSRIFSSKRYTTS 



SREFSSIGPTPSPSPSSVGAMSPYCAPESFRSLKPSPKWDVYGFGVILLELLTGKIVS
VEEIVLGNGLTVEDGHRAVRMADVAIRGELDGKQEFLLDCFKLGYSCASPVPQKRPTM
KESLAVLERFHPNSSVIKSSSFHYGH"
intron complement(22254..22349)
/number=1
exon complement(22350..23955)
/gene="AT4g37250"
/number=2
gene 34372..35334
/gene="AT4g37260"
CDS 34372..35334



Query Match 13.7%; Score 59.2; DB 8; Length 196339;

Best Local Similarity 54.1%; Pred. No. 8.7e-05;

Matches 118; Conservative 0; Mismatches 100; Indels 0; Gaps 0;



Qy 54 aggggcccctgctgatccctttcgggatggggcggcccaattgccccggggaaacgctcg 113 

III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 53741 AGGCTCGAAAGCTAATGCCGTTTGGGATGGGACGACGAGCTTGTCCTGGAGCTGAGCTTG 53682

Qy 114 cgctgcggaccgtcgggctggtgctcgcaacgctgctcaattgcttcgactgggacacgg 173 

I I I I I II I I I I I I I III II I I I I I I I I I I I I 

Db 53681 GGAAGCGGTTAGTGAGCCTTGCTCTTGGGTGCTTGATTCAGTCTTTCGAGTGGGAGAGAG 53622 

Qy 174 ttgatggagctcaggtttgacatgaagctancggcgggctgaccatgccccgggccgtcc 233 

I I I II I I I I I I II I I I I I I I I I I I II I 

Db 53621 TTGGTGCAGAACTTGTGGACATGACTGAAGGCGAAGGGATCACTATGCCTAAAGCTACTC 53562 

Qy 234 cgttggaggccatgtgcangccgcgtacagctatgcgt 271 

I I I I I I I I I I I I II I I I I I I II II 
Db 53561 CGTTGCGAGCTATGTGCAAGGCACGTGCCATTGTTGGT 53524 



RESULT 12 

ATAP21 

LOCUS ATAP21 206420 bp DNA PLN 30-JUL-1999
DEFINITION Arabidopsis thaliana DNA chromosome 4, ESSA I AP2 contig fragment
No. 1.
ACCESSION Z99707
VERSION Z99707.1 GI:4376087
KEYWORDS .
SOURCE thale cress.
ORGANISM Arabidopsis thaliana
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots;
Rosidae; eurosids II; Brassicales; Brassicaceae; Arabidopsis.
REFERENCE 1 (bases 1 to 206420)
AUTHORS Bevan,M., Terryn,N., Vos,P., Heijnen,L., Mewes,H.W., Mayer,K.F.X.
and Schueller,C.
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 206420)
AUTHORS EU Arabidopsis sequencing, project.
TITLE Direct Submission
JOURNAL Submitted (29-JUL-1999) MIPS, at the Max-Planck-Institut fuer
Biochemie, Am Klopferspitz 18a, D-82152 Martinsried, FRG, E-mail:
schuelle@mips.biochem.mpg.de, mayer@mips.biochem.mpg.de Project
Coordinator: Mike Bevan, Molecular Genetics Department, Cambridge
Laboratory, John Innes Centre, Colney Lane, NR4 7UJ Norwich, UK,
E-mail: michael.bevan@bbsrc.ac.uk
COMMENT On Mar 7, 1999 this sequence version replaced gi:4006849.
Information on performance of analysis and a more detailed
annotation of this entry and other sequences of chromosomes 3, 4
and 5 can be viewed at: http://www.mips.biochem.mpg.de/proj/thal/
this fragment has an overlap with ATAP22 at the 3' end.
FEATURES Location/Qualifiers
source 1..206420
/organism="Arabidopsis thaliana"
/variety="Columbia"
/db_xref="taxon:3702"
/chromosome="4"
source 1. .116845 

/organism="Arabidopsis thaliana" 
/db_xref="taxon:3702" 
/clone="BAC TAMU8H13" 
exon 3. .560 

/gene="C7A10.10" 
/number=l 
gene 3. .560 

/gene="C7A10.10" 
CDS <3. .560 

/gene="C7A10.10" 

/note="similarity to cytochrome P450, Nicotiana tabacum, 
PATX:E1188611 

Contains Cytochrome P450 cysteine heme-iron ligand 
signature [FGLGRRACPG]"
/codon_start=1

/product="cytochrome like protein"
/protein_id="CAB16768.1"
/db_xref="GI:4006850"

/translation="SLLNNPEVLNKARDEIDRMIGVDRLLEESDIPNLPYLQNIVSET
LRLYPAAPMLLPHVASKDCKVGGYDMPRGTMLLTNAWAIHRDPLLWDDPTSFKPERFE 
KEGEAKKLMPFGLGRRACPGSGLAQRLVTLSLGSLIQCFEWERIGEEEVDMTEGPGLT 
MPKARPLEAMCRARDFVGKILPDSS" 
gene 978. .2731 

/gene="C7A10.20" 
CDS join(978..1475,1630..2016,2114..2731)

/gene="C7A10.20" 

/note="similarity to cytochrome P450 Glycyrrhiza echinata, 
PATX:D1023287 

Contains Cytochrome P450 cysteine heme-iron ligand 
signatures [FGLGRRACPG] and 474 : [FGLGRRACPG] " 
/codon_start=l 

/product="cytochrome P450-like protein" 
/protein_id="CAB16769.1" 
/db_xref="GI : 4006851" 
/db_xref="SPTREMBL: 023154" 

/translation="MEGQTLIFTFLFISLSLTFIIGRIKRRPNLPPSPSWALPVIGHL
RLLKPPLHRVFLSVSESLGDAPIISLRLGNRLVFVVSSHSLAEECFTKNDVVLANRFN 
SLASKHISYGCTTVVTASYGDHWRNLRRIGAVEIFSAHRLNSFSSIRRDEIHRLIACL 
SRNSSLEFTKVEMKSMFSNLTFNNIIRMLAGKCYYGDGAEDDPEAKRVRELIAEGMGC 
FGAGNTADYLPILTWITGSEKRIKKIASRLDEFLQGLVDERREGKEKRQNTMVDHLLC 
LQETQPEYYTDNIIKGIMLSLILAGTDTSAVTLEWTLSALLNHPQILSKARDEIDNKV 
GLNRLVEESDLSHLPYLQNIVSESLRLYPASPLLVPHVASEDCKVGGYHMPRGTMLLT 
NAWAIHRDPKIWDDPTSFKPERFEKEGEAQKLLGFGLGRRACPGSGLAQRLASLTIGS 



LIQCFEWERIGEEEVDMTEGGGGVIMPKAIPLVAMCKARPVVGKILNESA" 

978. .1475 

/gene="C7A10.20" 

/ number =1 

1476. .1629 

/gene="C7A10.20" 

/number-1 

1630. .2016 

/gene="C7A10.20" 

/number=2 

2017. .2113 

/gene="C7A10.20" 

/ number =2 

2114. .2731 

/gene="C7A10.20" 

/number=3 

3115. .3615 

/gene="C7A10.30" 

/number=l 

3115. .5137 

/gene="C7A10.30" 

join(3115..3615,4090..4464,4535..5137)
/gene="C7A10.30"

/note="strong similarity to cytochrome P450, Glycyrrhiza
echinata, PATX:D1023287

Contains Cytochrome P450 cysteine heme-iron ligand
signature 431:[FGMGRRACPG]

contains EST gb:T43640, Aa395149, T42716, T41670"
/codon_start=1

/product="cytochrome P450-like protein"
/protein_id="CAB16753.1"
/db_xref="GI:2464850"
/db_xref="SPTREMBL:023155"

/translation="MDLNQILILSFLSLFTLAIFLLTRSKRKLNLPPSPAISLPVIGH

LHLLKPPLHRTFLSLSKSIGNAPVFHLRLGNRLVYVISSRSIAEECFTKNDVVLANRP 

KFTISKHLGYNATYLLSASYGDHWRNLRRI7VAVEIFSTHRLNSFLYIRKDEIRRLISH 

LSRDSLHGFVEVEMKTLLTNLASNTTIRMLAGKRYFGEDNDDAKLVKNLVSEAVTSAG 

AGNPIDYLSILRWVSSYEKRIKNLGNRFDTFLQKLVDEKRAEKEKGETMIDHLLALQD 

IQPDYYTDVI IKGIILTLIIAGTDTSSVTLEWAMSNLLNHPEILKKARMEIDEKVGLD 

RLVDESDIVNLSYLQSIVLETLRMYPAVPLLLPHLSSEDCKVGGYDIPSGTMVLTNAW 

AMHRDPEVWEDPEIFKPERFEKEGEAEKLISFGMGRRACPGAGLAHRLINQALGSLVQ 

CFEWERVGEDFVDMTEDKGATLPKAIPLRAMCKARSIVDKLI" 

3616. .4089 

/gene="C7A10.30" 

/number=l 

4090. .4464 

/gene="C7A10.30" 

/ number=2 

4465. .4534 

/gene="C7A10.30" 

/ number=2 

4535..5137

/gene="C7A10.30" 

/number=3 

5994. .7942 

/gene="C7A10.40" 

join(5994. .6494,6880. .7263,7340. .7942) 



/gene="C7A10.40" 

/note="similarity to cytochrome P450 monooxygenase,
Arabidopsis thaliana, PATCHX:D1029478

Contains ATP/GTP-binding site motif A (P-loop) [ARAIVGKT],
Cytochrome P450 cysteine heme-iron ligand signature
[FGMGRRACPG]"
/codon_start=1

/product="cytochrome P450-like protein"

/protein_id="CAB16770.1"

/db_xref="GI:4376088"

/db_xref="SPTREMBL:023156"

/translation="MDLTQILLLSFLFLTISIKLLLTKSNRKPNLPPSPAYPLPVIGH

LHLLKQPVHRTFHSISKSLGNAPIFHLRLGNRLVYVISSHSIAEECFTKNDVVLANRP 

DIIMAKHVGYNFTNMIAASYGDHWRNLRRIAAVEIFSSHRISTFSSIRKDEIRRLITH 

LSRDSLHGFVEVELKSLLTNLAFNNIIMMVAGKRYYGTGTEDNDEAKLVRELIAEIMA 

GAGSGNLADYLPSINWVTNFENQTKILGNRLDRVLQKLVDEKRAEKEKGQTLIDHLLS 

FQETEPEYYTDVI IKGI ILALVLAGTDTSSVTLEWAMSNLLNHPEILEKARAEIDDKI 

GSDRLVEESDIVNLHYLQNIVSETLRLYPAVPLLLPHFSSDECKVAGYDMPRRTLLLT 

NVWAMHRDPGLWEEPERFKPERFEKEGEARKLMPFGMGRRACPGAELGKRLVSLALGC 

LIQSFEWERVGAELVDMTEGEGITMPKATPLRAMCKARAIVGKTI" 

5994. .6494 

/gene="C7A10.40" 

/number=l 

6495. .6879 

/gene="C7A10.40" 

/number=1

6880. .7263 

/gene="C7A10.40" 

/number =2 

7264. .7339 

/gene="C7A10. 40" 

/number=2 

7340. .7942 

/gene="C7A10.40"

/number=3 

8851. .9360 

/gene="C7A10.50" 

/number=l 

8851. .11532 

/gene="C7A10.50" 

join(8851..9360,9980..10369,10876..11532)
/gene="C7A10.50"

/note="similarity to cytochrome P450 Glycyrrhiza echinata,
PATCHX:D102328

Contains Cytochrome P450 cysteine heme-iron ligand
signature [FGLGRRACPG]
contains EST gb:F13573, F13574"
/codon_start=1

/product="cytochrome P450-like protein"
/protein_id="CAB16771.1"
/db_xref="GI:4006853"

/translation="MDCILLILTTLVAIFIVKIVLLVTKPNKNLPPSPNICFPIIGHL
HLLKKPLLHRTLSHLSHSLGPVFSLRLGSRLAVIISSPTAAEECFLTKNDIVLANRPR 
FIMGKYVAYDYTSMVTAPYGDHWRNLRRITALEVFSTNRLNASAEIRHDEVKMLLQKL 
HDLSVERPAKVELRQLLTGLTLNVIMRMMTGKRFFEEDDGGKAGISLEFRELVAEILE 
LSAADNPADFLPALRWFDYKGLVKRAKRIGERMDSLLQGFLDEHRANKDRLEFKNTMI 
AHLLDSQEKEPHNYSDQTIKGLILMMVVGGTDTSALTVEWAMSNLLNHPQILETTRQN 



IDTQMETSSSRRLLKEEDLVNMNYLKNVVSETLRLYPVAPLMVPHVPSSDCVIGGFNV 
PRDTIVLVNLWAIHRDPSVWDDPTSFKPERFEGSDQFGHYNGKMMPFGLGRRACPGLS 
LANRVVGLLLGSMIQCFEWESGSGGQVDMTEGPGLSLPKAEPLVVTCRTREMASELLF 
FGSEPSNKNV" 
intron 9361. .9979 

/gene="C7A10.50" 
/number=l 



Query Match 13.7%; Score 59.2; DB 8; Length 206420; 

Best Local Similarity 54.1%; Pred. No. 8.7e-05; 

Matches 118; Conservative 0; Mismatches 100; Indels 0; Gaps 0; 

Qy 54 aggggcccctgctgatccctttcgggatggggcggcccaattgccccggggaaacgctcg 113 

III I III I! II II II I I I I I I I I I I I I 

Db 7713 AGGCTCGAAAGCTAATGCCGTTTGGGATGGGACGACGAGCTTGTCCTGGAGCTGAGCTTG 7772

Qy 114 cgctgcggaccgtcgggctggtgctcgcaacgctgctcaattgcttcgactgggacacgg 173 

III! II I I I I III III II I I I I I I I I I I I I 

Db 7773 GGAAGCGGTTAGTGAGCCTTGCTCTTGGGTGCTTGATTCAGTCTTTCGAGTGGGAGAGAG 7832 



Qy 174 ttgatggagctcaggtttgacatgaagctancggcgggctgaccatgccccgggccgtcc 233 

I I I I I I I I I I II I I I I I I I I I I I II I 

Db 7833 TTGGTGCAGAACTTGTGGACATGACTGAAGGCGAAGGGATCACTATGCCTAAAGCTACTC 7892



Qy 234 cgttggaggccatgtgcangccgcgtacagctatgcgt 271 

I'M II I I M I I I I I I I I I II II 
Db 7893 CGTTGCGAGCTATGTGCAAGGCACGTGCCATTGTTGGT 7930 



RESULT 13
AY039844
LOCUS       AY039844 1656 bp mRNA PLN 24-JUN-2001
DEFINITION  Arabidopsis thaliana AT4g37430/F6G17_80 mRNA, complete cds.
ACCESSION   AY039844
VERSION     AY039844.1 GI:14532439
KEYWORDS    FLI_CDNA.
SOURCE      thale cress.
  ORGANISM  Arabidopsis thaliana
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots;
            Rosidae; eurosids II; Brassicales; Brassicaceae; Arabidopsis.
REFERENCE   1 (bases 1 to 1656)
  AUTHORS   Cheuk,R., Chen,H., Kim,C.J., Koesema,E., Meyers,M.C., Banh,J.,
            Bowser,L., Carninci,P., Dale,J.M., Gibson,H.A., Goldsmith,A.D.,
            Hayashizaki,Y., Ishida,J., Jiang,P.X., Jones,T., Kamiya,A.,
            Karlin-Neumann,G., Kawai,J., Lam,B., Lee,J.M., Lin,J., Liu,S.X.,
            Miranda,M., Narusaka,M., Nguyen,M., Onodera,C.S., Palm,C.J.,
            Pham,P.K., Quach,H.L., Sakurai,T., Satou,M., Seki,M., Southwick,A.,
            Tang,C.C., Toriumi,M., Yamada,K., Yu,G., Yu,S., Shinozaki,K.,
            Davis,R.W., Theologis,A. and Ecker,J.R.
  TITLE     Arabidopsis cDNA clones
  JOURNAL   Unpublished
REFERENCE   2 (bases 1 to 1656)
  AUTHORS   Cheuk,R., Chen,H., Kim,C.J., Koesema,E., Meyers,M.C., Banh,J.,
            Bowser,L., Carninci,P., Dale,J.M., Gibson,H.A., Goldsmith,A.D.,
            Hayashizaki,Y., Ishida,J., Jiang,P.X., Jones,T., Kamiya,A.,
            Karlin-Neumann,G., Kawai,J., Lam,B., Lee,J.M., Lin,J., Liu,S.X.,
            Miranda,M., Narusaka,M., Nguyen,M., Onodera,C.S., Palm,C.J.,
            Pham,P.K., Quach,H.L., Sakurai,T., Satou,M., Seki,M., Southwick,A.,
            Tang,C.C., Toriumi,M., Yamada,K., Yu,G., Yu,S., Shinozaki,K.,
            Davis,R.W., Theologis,A. and Ecker,J.R.
  TITLE     Direct Submission
  JOURNAL   Submitted (06-JUN-2001) Salk Institute Genomic Analysis Laboratory
            (SIGnAL), Plant Biology Laboratory, The Salk Institute for
            Biological Studies, 10010 N. Torrey Pines Road, La Jolla, CA 92037,
            USA
COMMENT     RIKEN Genomic Sciences Center (GSC) members carried out the
            collection and clustering of RAFL cDNAs (RAFL cDNA: 'RIKEN
            Arabidopsis Full-Length cDNA'): Seki,M., Narusaka,M., Ishida,J.,
            Satou,M., Kamiya,A., Sakurai,T., Carninci,P., Kawai,J.,
            Hayashizaki,Y. and Shinozaki,K.

            The Salk, Stanford, PGEC (SSP) Consortium members carried out the
            sequencing and annotation of the RAFL cDNAs: Cheuk,R., Chen,H.,
            Kim,C.J., Koesema,E., Meyers,M.C., Shinn,P., Banh,J., Bowser,L.,
            Dale,J.M., Gibson,H.A., Goldsmith,A.D., Jiang,P.X., Jones,T.,
            Karlin-Neumann,G., Lam,B., Lee,J.M., Lin,J., Liu,S.X., Miranda,M.,
            Nguyen,M., Onodera,C.S., Palm,C.J., Pham,P.K., Quach,H.L.,
            Southwick,A., Tang,C.C., Toriumi,M., Yamada,K., Yu,G., Yu,S.,
            Davis,R.W., Theologis,A., and Ecker,J.R.

            Cheuk,R. (SSP/Salk) and Seki,M. (RIKEN GSC) contributed equally to
            this work. Shinozaki,K. (RIKEN GSC) and Ecker,J.R. (SSP/Salk)
            contributed equally to this work as PIs.
FEATURES Location/Qualifiers 
source 1..1656
/organism="Arabidopsis thaliana"
/db_xref="taxon:3702"
/chromosome="4"
/clone="RAFL05-02-B21 (R12849)"
/note="ecotype: Columbia"
5'UTR 1..59
CDS 60..1562
/note="cytochrome P450 monooxygenase (CYP91A2)"
/codon_start=1
/product="AT4g37430/F6G17_80"
/protein_id="AAK63948.1"
/db_xref="GI:14532440"
/translation="MLYFILLPLLFLVISYKFLYSKTQRFNLPPGPPSRPFVGHLHLM
KPPIHRLLQRYSNQYGPIFSLRFGSRRVVVITSPSLAQESFTGQNDIVLSSRPLQLTA 
KYVAYNHTTVGTAPYGDHWRNLRRICSQEILSSHRLINFQHIRKDEILRMLTRLSRYT 
QTSNESNDFTHIELEPLLSDLTFNNIVRMVTGKRYYGDDVNNKEEAELFKKLVYDIAM 
YSGANHSADYLPILKLFGNKFEKEVKAIGKSMDDILQRLLDECRRDKEGNTMVNHLIS 
LQQQQPEYYTDVIIKGLMMSMMLAGTETSAVTLEWAMANLLRNPEVLEKARSEIDEKI 
GKDRLIDESDIAVLPYLQNVVSETFRLFPVAPFLIPRSPTDDMKIGGYDVPRDTIVMV 
NAWAIHRDPEIWEEPEKFNPDRYNDGCGSDYYVYKLMPFGNGRRTCPGAGLGQRIVTL 
ALGSLIQCFEWENVKGEEMDMSESTGLGMRKMDPLRAMCRPRPIMSKLLL" 
3'UTR 1563..1656

BASE COUNT 503 a 410 c 348 g 395 t 

ORIGIN 



Query Match 13.7%; Score 59; DB 8; Length 1656;

Best Local Similarity 52.7%; Pred. No. 0.00014;

Matches 125; Conservative 0; Mismatches 112; Indels 0; Gaps 0;



Qy 32 gatggctccggcggcaaggccaaggggcccctgctgatccctttcgggatggggcggccc 91 

I I I I I I I I III III I I I I I I I I I I I I I I I I I I I I 
Db 1308 GACGGATGCGGAAGCGATTACTATGTTTACAAGCTGATGCCGTTTGGGAATGGCCGGAGA 1367



Qy 92 aattgccccggggaaacgctcgcgctgcggaccgtcgggctggtgctcgcaacgctgctc 151 

I I I I I I I I I I II I I I I I II I I I I II I I I I I I I 

Db 1368 ACTTGTCCCGGCGCCGGATTAGGTCAGAGGATTGTGACATTGGCGCTTGGATCGTTGATT 1427

Qy 152 aattgcttcgactgggacacggttgatggagctcaggtttgacatgaagctancggcggg 211 

I I I I I I I I I I I I I I II I I I I III I I III 

Db 1428 CAATGCTTTGAATGGGAGAATGTGAAAGGGGAAGAGATGGATATGTCTGAGAGTACTGGG 1487

Qy 212 ctgaccatgccccgggccgtcccgttggaggccatgtgcangccgcgtacagctatg 268

II I I I I I I II I I I I I I I I II I I I I I I I I I I I 

Db 1488 TTGGGTATGCGTAAGATGGATCCTTTACGGGCCATGTGTAGGCCTAGGCCCATTATG 1544 



RESULT 14
ATF6G17
LOCUS       ATF6G17 101009 bp DNA PLN 03-MAR-1999
DEFINITION  Arabidopsis thaliana DNA chromosome 4, BAC clone F6G17 (ESSA
            project).
ACCESSION   AL035601
VERSION     AL035601.1 GI:4468801
KEYWORDS
SOURCE      thale cress.
  ORGANISM  Arabidopsis thaliana
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots;
            Rosidae; eurosids II; Brassicales; Brassicaceae; Arabidopsis.
REFERENCE   1 (bases 1 to 101009)
  AUTHORS   Bevan,M., Koetter,P., Hempel,S., Entian,K.-D., Bancroft,I.,
            Mewes,H.W., Mayer,K.F.X. and Schueller,C.
  JOURNAL   Unpublished
REFERENCE   2 (bases 1 to 101009)
  AUTHORS   EU Arabidopsis sequencing, project.
  TITLE     Direct Submission
  JOURNAL   Submitted (03-MAR-1999) MIPS, at the Max-Planck-Institut fuer
            Biochemie, Am Klopferspitz 18a, D-82152 Martinsried, FRG, E-mail:
            schuelle@mips.biochem.mpg.de, mayer@mips.biochem.mpg.de Project
            Coordinator: Mike Bevan, Molecular Genetics Department, Cambridge
            Laboratory, John Innes Centre, Colney Lane, NR4 7UJ Norwich, UK,
            E-mail: michael.bevan@bbsrc.ac.uk
COMMENT     Information on performance of analysis and a more detailed
            annotation of this entry and other sequences of chromosome 4 can be
            viewed at: http://websvr.mips.biochem.mpg.de/proj/thal/.
FEATURES Location/Qualifiers
source 1..101009
/organism="Arabidopsis thaliana"
/variety="Columbia"
/db_xref="taxon:3702"
/chromosome="4"
misc_feature 1..18845
/note="position 1-18845 overlap to EMBL accession Z99707;
please refer to this entry for analysis and annotation"



complement(join(18286..18900,19005..19391,19523..20020))
/gene="F6G17.10"
18286..20020
/gene="F6G17.10"
complement(join(18286..18900,19005..19391,19523..20020))
/gene="F6G17.10"
/note="similarity to cytochrome P450 monooxygenase,
Arabidopsis thaliana, D78606
Contains Cytochrome P450 cysteine heme-iron ligand
signature [FGLGRRACPG]"
/codon_start=1
/product="cytochrome p450-like protein"
/protein_id="CAB38203.1"
/db_xref="GI:4468802"
/translation="MEALMLIFTFCFIVLSLIFLIGRIKRKLNLPPSPAWALPVIGHL
RLLKPPLHRVFLSVSQSLGDAPIISLRLGNRLLFVVSSHSIAEECFTKNDVILANRQT
TISTKHISYGNSTVVSASYSEHWRNLRRIGALEIFSAHRLNSFSSIRRDEIRRLIGRL
LRNSSYGFTKVEMKSMFSDLTFNNIIRMLAGKCYYGDGKEDDPEAKRVRTLIAEAMSS
SGPGNAADYIPILTWITYSETRIKKLAGRLDEFLQGLVDEKREGKEKKENTMVDHLLC
LQETQPEYYMDRIIKGTMLSLIAGGTDTTAVTLEWALSSLLNNPEVLNKARDEIDRMI
GVDRLLEESDIPNLPYLQNIVSETLRLYPAAPMLLPHVASKDCKVGGYDMPRGTMLLT
NAWAIHRDPLLWDDPTSFKPERFEKEGEAKKLMPFGLGRRACPGSGLAQRLVTLSLGS
LIQCFEWERIGEEEVDMTEGPGLTMPKARPLEAMCRARDFVGKILPDSS"

complement(18286..18900)
/gene="F6G17.10"
/number=1
complement(18901..19004)
/number=1
complement(19005..19391)
/gene="F6G17.10"
/number=2
complement(19392..19522)
/number=2
complement(19523..20020)
/gene="F6G17.10"
/number=3
21133..22840
/gene="F6G17.20"
complement(21133..21744)
/gene="F6G17.20"
/number=1
complement(join(21133..21744,21842..22225,22343..22840))
/gene="F6G17.20"
complement(join(21133..21744,21842..22225,22343..22840))
/gene="F6G17.20"
/note="similarity to cytochrome P450, Glycyrrhiza
echinata, AB001379
Contains Cytochrome P450 cysteine heme-iron ligand
signature [FGLGRRACPG]
contains EST gb:AA586064, H76015, T41596, N38867"
/codon_start=1
/product="cytochrome P450-like protein"
/protein_id="CAB38204.1"
/db_xref="GI:4468803"

/translation="METKTLIFSILFVVLSLIYLIGKLKRKPNLPPSPAWSLPVIGHL 
RLLKPPIHRTFLSLSQSLNNAPIFSLRLGNRLVFVNSSHSIAEECFTKNDVVLANRPN 
FILAKHVAYDYTTMIAASYGDHWRNLRRIGSVEIFSNHRLNSFLSIRKDEIRRLVFRL 



SRNFSQEFVKVDMKSMLSDLTFNNILRMVAGKRYYGDGVEDDPEAKRVRQLIADVVAC
AGAGNAVDYLPVLRLVSDYETRVKKLAGRLDEFLQGLVDEKREAKEKGNTMIDHLLTL
QESQPDYFTDRIIKGNMLALILAGTDTSAVTLEWALSNVLNHPDVLNKARDEIDRKIG 
LDRLMDESDISNLPYLQNIVSETLRLYPAAPMLLPHVASEDCKVAGYDMPRGTILLTN 
VWAIHRDPQLWDDPMSFKPERFEKEGEAQKLMPFGLGRRACPGSGLAHRLINLTLGSL 
IQCLEWEKIGEEVDMSEGKGVTMPKAKPLEAMCRARPSVVKIFNESV" 
intron complement(21745..21841)
/number=1
exon complement(21842..22225)
/gene="F6G17.20"
/number=2
intron complement(22226..22342)
/number=2
exon complement(22343..22840)
/gene="F6G17.20"
/number=3
gene 23202..25100
/gene="F6G17.30"
CDS complement(23202..25100)
/gene="F6G17.30"
/note="similarity to various predicted proteins,
Arabidopsis thaliana
Contains Cytochrome c family heme-binding site signature
[CSDCHT]"
/codon_start=1
/product="putative protein"
/protein_id="CAB38205.1"
/db_xref="GI:4468804"
/translation="MASSPLLATSLPQNQLSTTATARFRLPPPEKLAVLIDKSQSVDE
VLQIHAAILRHNLLLHPRYPVLNLKLHRAYASHGKIRHSLALFHQTIDPDLFLFTAAI 
NTASINGLKDQAFLLYVQLLSSEINPNEFTFSSLLKSCSTKSGKLIHTHVLKFGLGID 
PYVATGLVDVYAKGGDVVSAQKVFDRMPERSLVSSTAMITCYAKQGNVEAARALFDSM 
CERDIVSWNVMIDGYAQHGFPNDALMLFQKLLAEGKPKPDEITVVAALSACSQIGALE 
TGRWIHVFVKSSRIRLNVKVCTGLIDMYSKCGSLEEAVLVFNDTPRKDIVAWNAMIAG 
YAMHGYSQDALRLFNEMQGITGLQPTDITFIGTLQACAHAGLVNEGIRIFESMGQEYG 
IKPKIEHYGCLVSLLGRAGQLKRAYETIKNMNMDADSVLWSSVLGSCKLHGDFVLGKE 
IAEYLIGLNIKNSGIYVLLSNIYASVGDYEGVAKVRNLMKEKGIVKEPGISTIEIENK 
VHEFRAGDREHSKSKEIYTMLRKISERIKSHGYVPNTNTVLQDLEETEKEQSLQVHSE 
RLAIAYGLISTKPGSPLKIFKNLRVCSDCHTVTKLISKITGRKIVMRDRNRFHHFTDG 
SCSCGDFW" 

exon complement(23202..25100)
/gene="F6G17.30"
/number=1
gene complement(23202..25100)
/gene="F6G17.30"
gene 30884..32930
/gene="F6G17.40"
exon 30884..31206
/gene="F6G17.40"
/number=1
CDS join(30884..31206,31359..31460,31544..32930)
/gene="F6G17.40"
/note="similarity to hypothetical protein, Glycine max,
PIR2:S17433
contains EST gb:T44614, AAA712644, AA395496"
/codon_start=1
/product="auxin-responsive GH3-like protein"
/protein_id="CAB38206.1"
/db_xref="GI:4468805"
/translation="MAVDSPLQSRMVSATTSEKDVKALKFIEEMTRNPDSVQEKVLGE
ILTRNSNTEYLKRFDLDGVVDRKTFKSKVPVVTYEDLKPEIQRISNGDCSPILSSHPI 
TEFLTSSGTSAGERKLMPTIEEDLDRRQLLYSLLMPVMNLYVPGLDKGKGLYFLFVKS 
ESKTSGGLPARPVLTSYYKSDHFKRRPYDPYNVYTSPNEAILCSDSSQSMYAQMLCGL 
LMRHEVLRLGAVFASGLLRAISFLQNNWKELARDISTGTLSSRIFDPAIKNRMSKILT
KPDQELAEFLVGVCSQENWEGIITKIWPNTKYLDVIVTGAMAQYIPTLEYYSGGLPMA 
CTMYASSESYFGINLKPMCKPSEVSYTIMPNMAYFEFLPHNHDGDGAAEASLDETSLV 
ELANVEVGKEYELVITTYAGLYRYRVGDILRVTGFHNSAPQFKFIRRKNVLLSVESDK 
TDEAELQKAVENASRLFAEQGTRVIEYTSYAETKTIPGHYVIYWELLGRDQSNALMSE 
EVMAKCCLEMEESLNSVYRQSRVADKSIGPLEIRVVRNGTFEELMDYAISRGASINQY 
KVPRCVSFTPIMELLDSRVVSAHFSPSLPHWSPERRR" 

intron 31207..31358
/gene="F6G17.40"
/number=1
exon 31359..31460
/gene="F6G17.40"
/number=2
intron 31461..31543
/gene="F6G17.40"
/number=2
exon 31544..32930
/gene="F6G17.40"
/number=3
gene 35807..37359
/gene="F6G17.50"
CDS join(35807..36127,36230..36289,36724..37359)
/gene="F6G17.50"
/note="strong similarity to cytochrome P450 monooxygenase
CYP91A2, Arabidopsis thaliana, D78607
Contains Cytochrome P450 cysteine heme-iron ligand
signature [FGNGRRSCPG]
contains EST gb:AA712784"
/codon_start=1
/product="cytochrome P450 monooxygenase-like protein"
/protein_id="CAB38207.1"
/db_xref="GI:4468806"
/translation="MVTGKRYYGDEVHNEEEANVFKKLVADINDCSGARHPGDYLPFM
KMFGGSFEKKVKALAEAMDEILQRLLEECKRDKDGNTMVNHLLSLQQNEPEYYTDVTI 
KGLMLIFCFFGQLQILWFTNIETGWGMMIAGTDTSAVTLEWAMSSLLNHPEALEKAKL 
EIDEKIGQERLIDEPDIANLPYLQNIVSETFRLYPAAPLLVPRSPTEDIKVGGYDVPR 
GTMVMVNAWAIHRDPELWNEPEKFKPERFNGGEGGGRGEDVHKLMPFGNGRRSCPGAG 
LGQKIVTLALGSLIQCFDWQKVNGEAIDMTETPGMAMRKKIPLSALCQSRPIMSKLQA 
HLKG" 

exon 35807..36127
/gene="F6G17.50"
/number=1
intron 36128..36229
/gene="F6G17.50"



Query Match 13.7%; Score 59; DB 8; Length 101009; 

Best Local Similarity 52.7%; Pred. No. 0.0001; 

Matches 125; Conservative 0; Mismatches 112; Indels 0; Gaps 0; 

Qy 32 gatggctccggcggcaaggccaaggggcccctgctgatccctttcgggatggggcggccc 91 

I I I I I II I III III I I I I I I I I I I I I I I I I I I I I 



Db 49737 GACGGATGCGGAAGCGATTACTATGTTTACAAGCTGATGCCGTTTGGGAATGGCCGGAGA 49796

Qy 92 aattgccccggggaaacgctcgcgctgcggaccgtcgggctggtgctcgcaacgctgctc 151 

I I I I I I I I I I I I I I I I I II II I I I I I I I I II I 

Db 49797 ACTTGTCCCGGCGCCGGATTAGGTCAGAGGATTGTGACATTGGCGCTTGGATCGTTGATT 49856

Qy 152 aattgcttcgactgggacacggttgatggagctcaggtttgacatgaagctancggcggg 211 

' I II II I I I I I I I I I II I I I I Ml .1 I III 

Db 49857 CAATGCTTTGAATGGGAGAATGTGAAAGGGGAAGAGATGGATATGTCTGAGAGTACTGGG 49916



Qy 212 ctgaccatgccccgggccgtcccgttggaggccatgtgcangccgcgtacagctatg 268

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 49917 TTGGGTATGCGTAAGATGGATCCTTTACGGGCCATGTGTAGGCCTAGGCCCATTATG 4 9973 



RESULT 15
ATCHRIV87
LOCUS       ATCHRIV87 196339 bp DNA PLN 16-MAR-2000
DEFINITION  Arabidopsis thaliana DNA chromosome 4, contig fragment No. 87.
ACCESSION   AL161591
VERSION     AL161591.2 GI:7270703
KEYWORDS
SOURCE      thale cress.
  ORGANISM  Arabidopsis thaliana
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots;
            Rosidae; eurosids II; Brassicales; Brassicaceae; Arabidopsis.
REFERENCE   1 (bases 42610 to 143618; 123423 to 196339)
  AUTHORS   Rose,M., Hempel,S., Entian,K.-D., Mewes,H.W., Lemcke,K. and
            Mayer,K.F.X.
  JOURNAL   Unpublished
REFERENCE   2 (bases 1 to 196339)
  AUTHORS   EU Arabidopsis sequencing, project.
  TITLE     Direct Submission
  JOURNAL   Submitted (10-MAR-2000) MIPS, at the Max-Planck-Institut fuer
            Biochemie, Am Klopferspitz 18a, D-82152 Martinsried, FRG, E-mail:
            lemcke@mips.biochem.mpg.de, mayer@mips.biochem.mpg.de Project
            Coordinator: Mike Bevan, Molecular Genetics Department, Cambridge
            Laboratory, John Innes Centre, Colney Lane, NR4 7UJ Norwich, UK,
            E-mail: michael.bevan@bbsrc.ac.uk
COMMENT     Information on performance of analysis and a more detailed
            annotation of this entry and other sequences of chromosomes 3, 4
            and 5 can be viewed at: http://www.mips.biochem.mpg.de/proj/thal/
            this fragment has an overlap with ATCHRIV86 at the 5' end and an
            overlap with ATCHRIV88 at the 3' end.
FEATURES Location/Qualifiers
source 1..196339
/organism="Arabidopsis thaliana"
/variety="Columbia"
/db_xref="taxon:3702"
/chromosome="4"
gene 6146..7792
/gene="AT4g37210"
exon 6146..6474
/gene="AT4g37210"
/number=1
CDS join(6146..6474,6702..6919,7013..7225,7313..7403,
7489..7792)
/gene="AT4g37210"
/note="intron number 1 is a special U12 intron
similarity to nuclear histone-binding protein N1/N2,
Xenopus laevis, PIR2:A25680"
/codon_start=1
/product="putative protein"
/protein_id="CAB80387.1"
/db_xref="GI:7270704"
/translation="MVEESASASEASVIQTLTEPATEIAQTLEPNLASIEATVESVVQ
GGTESTCNNDANNNNAADSAATEVCDEEREKTLEFAEELTEKGSVFLKENDFAEAVDC
FSRALEIRVAHYGELDAECINAYYRYGLALLAKAQAEADPLGNMPKKEGEVQQESSNG
ESLAPSVVSGDPERQGSSSGQEGSGGKDQGEDGEDCQDDDLSDADGDADEDESDLDMA
WKMLDIARVITDKQSTETMEKVDILCSLAEVSLEREDIESSLSDYKNALSILERLVEP
DSRRTAELNFRICICLETGCQPKEAIPYCQKALLICKARMERLSNEIKGASGSATSST
VSEIDEGIQQSSNVPYIDKSASDKEVEIGDLAGLAEDLEKKASKLNLSVH"

intron 6475..6701
/gene="AT4g37210"
/number=1
exon 6702..6919
/gene="AT4g37210"
/number=2
intron 6920..7012
/gene="AT4g37210"
/number=2
exon 7013..7225
/gene="AT4g37210"
/number=3
intron 7226..7312
/gene="AT4g37210"
/number=3
exon 7313..7403
/gene="AT4g37210"
/number=4
intron 7404..7488
/gene="AT4g37210"
/number=4
exon 7489..7792
/gene="AT4g37210"
/number=5
gene 8876..9739
/gene="AT4g37220"
CDS join(8876..9051,9141..9248,9330..9400,9486..9739)
/gene="AT4g37220"
/note="strong similarity to cold acclimation protein
WCOR413, Triticum aestivum, PATCHX:G1657855
Contains Prokaryotic membrane lipoprotein lipid attachment
site AA147-157
contains EST gb:AW033651.1, W43270, AA650647, AI996990.1,
AA728669, T42795, Z37671, AI100742, T42949, AA040998,
AA395771, AA657303, T41871, T45633"
/codon_start=1
/product="cold acclimation protein homolog"
/protein_id="CAB80388.1"
/db_xref="GI:7270705"
/translation="MGRGEFLAMKTEENAANLINSDMNEFVAAAKKLVKDVGMLGGVG
FGTSVLQWAASIFAIYLLILDRTNWKTKMLTTLLVPYIFFTLPSVIFQFFSGDFGKWI
ALIAIIVRLFFPKEFPEWLEIPVALILIVVVSPSLIAWTLRESWVGAVICLVIACYLF
HEHIKASGGFKNSFTQKNGISNTIGIVALLVYPVWTIFFHIF"
exon 8876..9051
/gene="AT4g37220"
/number=1
intron 9052..9140
/gene="AT4g37220"
/number=1
exon 9141..9248
/gene="AT4g37220"
/number=2
intron 9249..9329
/gene="AT4g37220"
/number=2
exon 9330..9400
/gene="AT4g37220"
/number=3
intron 9401..9485
/gene="AT4g37220"
/number=3
exon 9486..9739
/gene="AT4g37220"
/number=4
gene 10194..11357
/gene="AT4g37230"
exon 10194..10398
/gene="AT4g37230"
/number=1
CDS join(10194..10398,11131..11357)
/gene="AT4g37230"
/note="possible frameshift in DNA sequence at pos.
51188-51200, 5' part of gene couldn't be reconstructed,
possible pseudogene, no ATG
strong similarity to photosystem II oxygen-evolving
complex protein 1, spinach, PIR2:A23626
contains EST gb:Z34685"
/codon_start=1
/product="photosystem II oxygen-evolving complex like
protein (partial)"
/protein_id="CAB80389.1"
/db_xref="GI:7270706"
/translation="MTSLTYTLDEIEGPFEVDYAAVTVHNFLVGSVYRSCSRSSSSWH
RRGYFGPKSIPSAFTQGHVGNKSDQYQGYDNAVALPARGNNEELAKENNKITLSVTKS
NPESGEVIGAFESIQPSDTDLGATTPKDVKIQGIWYCQLDE"
intron 10399..11130
/gene="AT4g37230"
/number=1
exon 11131..11357
/gene="AT4g37230"
/number=2
exon 18388..18822
/gene="AT4g37240"
/number=1
gene 18388..18822
/gene="AT4g37240"
CDS 18388..18822
/gene="AT4g37240"
/note="contains EST gb:Z18456"



/product="putative protein"
/protein_id="CAB80390.1"
/db_xref="GI:7270707"
/translation="MEFANPVKVGYVLLKYPMCFICNSDDMDFDDAVAAISADEELQL
GQIYFALPLCWLRQPLKAEEMAALAVKASSALMRGGGGGCRRKCVEPIVSDKLRMRVG
SGDDTVGSGSGRRKVRNGDGGGSVSSSRRRKCYAAELSTIDE"
gene 21559..23955
/gene="AT4g37250"
exon complement(21559..22253)
/gene="AT4g37250"
/number=1
gene complement(join(21559..22253,22350..23955))
/gene="AT4g37250"
CDS complement(join(21559..22253,22350..23955))
/gene="AT4g37250"
/note="similarity to protein kinase TMK1, receptor type
precursor - Arabidopsis thaliana, PIR1:JQ1674
contains EST gb:H76836"
/codon_start=1
/product="receptor kinase-like protein"
/protein_id="CAB80391.1"
/db_xref="GI:7270708"
/translation="MELISVIFFFFCSVLSSSALNSDGLVLMKFKSSVLVDPLSLLQT
WNYKHESPCSWRGISCNNDSKVLTLSLPNSQLLGSIPSDLGSLLTLQSLDLSNNSFNG
PLPVSFFNARELRFLDLSSNMISGEIPSAIGDLHNLLTLNLSDNALAGKLPTNLASLR
NLTVVSLENNYFSGEIPGGWRVVEFLDLSSNLINGSLPPDFGGYSLQYLNVSFNQISG
EIPPEIGVNFPRNVTVDLSFNNLTGPIPDSPVFLNQESNFFSGNPGLCGEPTRNPCLI
PSSPSIVSEADVPTSTPAIAAIPNTIGSNPVTDPNSQQTDPNPRTGLRPGVIIGIVVG
DIAGIGILAVIFLYIYRCKKNKIVDNNNNDKQRTETDTITLSTFSSSSSSPEESRRFR
KWSCLRKDPETTPSEEEDEDDEDEESGYNANQRSGDNKLVTVDGEKEMEIETLLKASA
YILGATGSSIMYKAVLEDGRVFAVRRLGENGLSQRRFKDFEPHIRAIGKLVHPNLVRL
CGFYWGTDEKLVIYDFVPNGSLVNPRYRKGGGSSSPYHLPWETRLKIAKGIARGLAYL
HEKKHVHGNLKPSNILLGHDMEPKIGDFGLERLLTGETSYIRAGGSSRIFSSKRYTTS
SREFSSIGPTPSPSPSSVGAMSPYCAPESFRSLKPSPKWDVYGFGVILLELLTGKIVS
VEEIVLGNGLTVEDGHRAVRMADVAIRGELDGKQEFLLDCFKLGYSCASPVPQKRPTM
KESLAVLERFHPNSSVIKSSSFHYGH"
intron complement(22254..22349)
/number=1
exon complement(22350..23955)
/gene="AT4g37250"
/number=2
gene 34372..35334
/gene="AT4g37260"
CDS 34372..35334



Query Match 13.7%; Score 59; DB 8; Length 196339; 

Best Local Similarity 52.7%; Pred. No. 9.8e-05; 

Matches 125; Conservative 0; Mismatches 112; Indels 0; Gaps 0;



Qy 32 gatggctccggcggcaaggccaaggggcccctgctgatccctttcgggatggggcggccc 91 

I I I I I I I I III III I I I I I I I I I I I I I I I I I I I I 
Db 92346 GACGGATGCGGAAGCGATTACTATGTTTACAAGCTGATGCCGTTTGGGAATGGCCGGAGA 92405



Qy 92 aattgccccggggaaacgctcgcgctgcggaccgtcgggctggtgctcgcaacgctgctc 151 

I I I I I I I I I I II I I I I I II I I I I I I I I I I I I I 



Db 92406 ACTTGTCCCGGCGCCGGATTAGGTCAGAGGATTGTGACATTGGCGCTTGGATCGTTGATT 92465



Qy 152 aattgcttcgactgggacacggttgatggagctcaggtttgacatgaagctancggcggg 211 

I I I I I I I I I I I I I I II I I I I III I I III 

Db 92466 CAATGCTTTGAATGGGAGAATGTGAAAGGGGAAGAGATGGATATGTCTGAGAGTACTGGG 92525

Qy 212 ctgaccatgccccgggccgtcccgttggaggccatgtgcangccgcgtacagctatg 268

II I I I I I I I I I I I I I I I I I II I I I I I I I I I I 

Db 92526 TTGGGTATGCGTAAGATGGATCCTTTACGGGCCATGTGTAGGCCTAGGCCCATTATG 92582 



Search completed: February 7, 2002, 11:08:54
Job time: 10060 sec



GenCore version 4.5 
Copyright (c) 1993 - 2000 Compugen Ltd. 



OM nucleic - nucleic search, using sw model 



Run on: February 7, 2002, 11:00:03 ; Search time 428.31 Seconds
(without alignments)
864.711 Million cell updates/sec

Title: US-09-394-745-6514
Perfect score: 432
Sequence: 1 gtccagcagctcggacttac ... attttctttttttttcttgg 432

Scoring table: IDENTITY_NUC
Gapop 10.0 , Gapext 1.0

Searched: 930621 seqs, 428662619 residues

Total number of hits satisfying chosen parameters: 1861242

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries



Database : 



N_Gene 
/SI 
/SI 
/SI 
/SI 
/SI 
/SI 
/SI 
/SI 
/SI 

/s 
/s 
/s 
/s 



10 

11 

12 
13 



seq_1101: * 

DS2/gcgdata 

DS2/gcgdata 

DS2/gcgdata 

DS2/gcgdata 

DS2/gcgdata 

DS2/gcgdata 

DS2/gcgdata 

DS2/gcgdata 

DS2/gcgdata 

IDS2/gcgdat 

IDS2/gcgdat 

IDS2/gcgdat 

IDS2/gcgdat 



/genes 
/genes 
/genes 
/genes 
/genes 
/genes 
/genes 
/genes 
/genes 
a/gene 
a/gene 
a/gene 
a/gene 



eq/genes 
eq/genes 
eq/genes 
eq/genes 
eq/genes 
eq/genes 
eq/genes 
eq/genes 
eq/genes 
seq/gene 
seq/gene 
seq/gene 
seq/gene 
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neseq/geneseqn/NA1993 . DAT : * 
neseq/geneseqn/NA1994 . DAT: * ' 
neseq/geneseqn/NA1995 . DAT: * 
neseq/geneseqn/NA1996 . DAT: * 
neseq/geneseqn/NA1997 . DAT: * 
neseq/geneseqn/NA1998 . DAT : * 
neseq/geneseqn/NA1999. DAT: * 
neseq/geneseqn/NA2000 . DAT : * 
neseq/geneseqn/NA2001 . DAT : * 



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
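
The per-result statistics printed with each alignment can be cross-checked against the header values. The sketch below rests on an assumption inferred from the numbers in this report, not on a documented GenCore formula: it treats Query Match as Score divided by the Perfect score (432 for this query) and Best Local Similarity as Matches divided by the aligned columns (Matches + Mismatches + Indels). The function names are hypothetical.

```python
def query_match_pct(score: float, perfect_score: float) -> float:
    """Score as a percentage of the query's perfect score (assumed relation)."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity_pct(matches: int, mismatches: int, indels: int = 0) -> float:
    """Identical columns as a percentage of all aligned columns (assumed relation)."""
    aligned = matches + mismatches + indels
    return round(100.0 * matches / aligned, 1)


if __name__ == "__main__":
    # Values copied from RESULT 1 of this report:
    # Score 59.6, Perfect score 432, Matches 119, Mismatches 101, Indels 0.
    print(query_match_pct(59.6, 432))
    print(best_local_similarity_pct(119, 101, 0))
```

With the RESULT 1 values this gives 13.8 and 54.1, which agree with the Query Match and Best Local Similarity printed for that hit.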
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ALIGNMENTS 



RESULT 1 
AAV23837 

ID AAV23837 standard; DNA; 622 BP. 
XX 

AC AAV23837; 
XX 

DT 31-JUL-1998 (first entry) 
XX 

DE Plant C4H enzyme DNA sequence. 
XX 

KW Lignin biosynthetic pathway; eucalyptus; pine; transgenic plant; 

KW lignin content; tree processing; cellulose fibre; ss. 

XX

OS Eucalyptus grandis. 

XX 

PN WO9811205-A2. 
XX 

PD 19-MAR-1998. 
XX 

PF 10-SEP-1997; 97WO-NZ00112.
XX

PR 11-SEP-1996; 96US-0713000.
XX 

PA (FLET-) FLETCHER CHALLENGE FORESTS LTD. 

PA (GENE-) GENESIS RES & DEV CORP LTD. 
XX 

PI Bloksberg LN, Grierson AW, Havukkala IJ; 
XX 

DR WPI; 1998-207374/18. 
XX 

PT Sequences useful for modification of plant lignin content or 

PT structure - from Eucalyptus grandis (eucalyptus) and Pinus radiata 

PT (pine) are associated with lignin biosynthesis pathway, useful e.g. 

PT in paper industry 

XX 

PS Claim 1; Page 35; 82pp; English. 
XX 

CC This sequence represents a fragment of the C4H enzyme coding sequence. It 

CC is an example of a DNA sequence of the invention, which are from 

CC Eucalyptus grandis (eucalyptus) and Pinus radiata (pine) associated with 

CC the lignin biosynthesis pathway. Constructs containing the DNA sequences 

CC can be used to produce transgenic plants or plant cells, especially woody 

CC plants e.g. eucalyptus or pine species but also e.g. monocotyledons or 

CC dicotyledons; by stably incorporating the constructs into the plant 

CC genome. The lignin content or structure, or activity of a specific enzyme 



CC in the plant, can therefore be modulated. Reductions in lignin content or 

CC changes in composition are useful in tree processing for paper. High 

CC lignin content results in energy- and chemical-intensive separation 

CC methods in order to obtain the pure cellulose fibre required. Reductions 

CC in lignin content may also be useful for forage crops, whilst increases 

CC or changes in composition may be desirable to increase the mechanical 

CC strength of wood, change its colour or increase its resistance to rot. 

CC The sequences are also useful as probes to isolate DNA sequences encoding 

CC enzymes involved in the lignin biosynthesis pathway from other plant 

CC species. 

XX 

SQ Sequence 622 BP; 170 A; 117 C; 178 G; 157 T; 0 other;



Query Match 13.8%; Score 59.6; DB 19; Length 622; 

Best Local Similarity 54.1%; Pred. No. 2.6e-08; 

Matches 119; Conservative 0; Mismatches 101; Indels 0; Gaps 0; 

Qy 61 cctgctgatccctttcgggatggggcggcccaattgccccggggaaacgctcgcgctgcg 120 

II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 180 ccgactattgccgtttgggatggggaggagaagttgtcctggtgctggccttgccaatag 239 

Qy 121 gaccgtcgggctggtgctcgcaacgctgctcaattgcttcgactgggacacggttgatgg 180 

II J I I I I I I I I I I I I ! I I I I I I M I I I I I I I I I I I 
Db 240 agtggtgagcttggtcctggcggcgcttattcagtgcttcgaatgggaacgagttggcga 299 

Qy 181 agctcaggtttgacatgaagctancggcgggctgaccatgccccgggccgtcccgttgga 240 

II III I I I I I I I I I I I I I I I I I I I I I 

Db 300 agaattggtggacttgtccgaggggacgggactcacaatgccaaagagagagccattgga 359 



Qy 241 ggccatgtgcangccgcgtacagctatgcgtggtgttctt 280 

I II I I I I I I I I I I I I I III I I I I I 
Db 360 ggccttgtgcaaagcgcgtgaatgcatgatagctaatgtt 399 



RESULT 2 
AAZ06838 

ID AAZ06838 standard; cDNA; 622 BP. 
XX 

AC AAZ06838; 
XX 

DT 09-NOV-1999 (first entry) 
XX 

DE Eucalyptus cinnamate 4-hydroxylase (C4H) cDNA.
XX 

KW Lignin; biosynthesis; forage crop; wood; paper production; 

KW transgenic plant; ss. 

XX 

OS Eucalyptus grandis.
XX 

PN US5952486-A. 
XX 

PD 14-SEP-1999. 
XX 

PF 21-NOV-1997; 97US-0975316.
XX

PR 21-NOV-1997; 97US-0975316.

PR 11-SEP-1996; 96US-0713000.
XX

PA (FLET-) FLETCHER CHALLENGE FORESTS LTD.

PA (GENE-) GENESIS RES & DEV CORP LTD.
XX

PI Bloksberg LN, Grierson AW, Havukkala I;
XX

DR WPI; 1999-527029/44.
XX

PT Isolated DNA sequence encoding enzymes from the lignin synthetic

PT pathway useful for generating plants with an altered lignin content 
XX 

PS Example 1; Columns 31-32; 48pp; English. 
XX 

CC This sequence represents a cinnamate 4-hydroxylase (C4H) 

CC cDNA from Eucalyptus grandis. This enzyme is involved in the 

CC biosynthesis of lignin, an insoluble polymer which is primarily 

CC responsible for the rigidity of plant stems. Lignin serves as a matrix 

CC around the polysaccharide components of some plant cell walls. The 

CC higher the lignin content, the more rigid the plant. Lignin also plays a 

CC role in disease resistance of plants by impeding the penetration and 

CC propagation of pathogenic agents. Lignin is formed by polymerisation of 

CC at least three different monolignols (para-coumaryl alcohol, coniferyl 

CC alcohol and sinapyl alcohol). These three monolignols are synthesised by

CC similar pathways from phenylalanine in a multistep process and are

CC believed to be polymerised into lignin via a free radical mechanism. 

CC The lignin content of plants can be altered using DNA sequences encoding 

CC these enzymes. Lignin content can be increased by incorporation of 

CC additional copies of genes encoding these enzymes into the target plant. 

CC This could be beneficial for increasing the mechanical strength of wood. 

CC Similarly, a decrease in lignin content can be obtained by transforming 

CC the target plant with antisense copies of such genes. This may be 

CC beneficial in plants used as forage crops for livestock (lignin is 

CC indigestible) and in trees used in paper manufacture. 

XX 

SQ Sequence 622 BP; 170 A; 117 C; 178 G; 157 T; 0 other; 



Query Match 13.8%; Score 59.6; DB 20; Length 622; 

Best Local Similarity 54.1%; Pred. No. 2.6e-08; 

Matches 119; Conservative 0; Mismatches 101; Indels 0; Gaps 0; 

Qy 61 cctgctgatccctttcgggatggggcggcccaattgccccggggaaacgctcgcgctgcg 120 

II II I I I I I I M I II I I I I I I III II II I I I I I I 

Db 180 ccgactattgccgtttgggatggggaggagaagttgtcctggtgctggccttgccaatag 239 

Qy 121 gaccgtcgggctggtgctcgcaacgctgctcaattgcttcgactgggacacggttgatgg 180 

II I II I I II I I I I I I I I I II I I I I I I I II I I I I I I 
Db 240 agtggtgagcttggtcctggcggcgcttattcagtgcttcgaatgggaacgagttggcga 299 

Qy 181 agctcaggtttgacatgaagctancggcgggctgaccatgccccgggccgtcccgttgga 240 

II -II I J I I I I I I I I I I I I I I I I I I I I 

Db 300 agaattggtggacttgtccgaggggacgggactcacaatgccaaagagagagccattgga 359 

Qy 241 ggccatgtgcangccgcgtacagctatgcgtggtgttctt 280 

I II I I I I I I I I I I I I I III II III 
Db 360 ggccttgtgcaaagcgcgtgaatgcatgatagctaatgtt 399 



RESULT 3 
AAA67924 

ID AAA67924 standard; DNA; 622 BP. 
XX 

AC AAA67924;
XX 

DT 24-OCT-2000 (first entry) 
XX 

DE Eucalyptus grandis C4H nucleotide sequence SEQ ID NO: 17. 
XX 

KW Plant; lignin; lignin biosynthetic pathway; Eucalyptus grandis; 

KW Pinus radiata; Monterey pine; ds.

XX 

OS Eucalyptus grandis. 
XX 

PN WO200022099-A1. 
XX 

PD 20-APR-2000. 
XX 

PF 06-OCT-1999; 99WO-NZ00168.
XX

PR 09-OCT-1998; 98US-0169789.

PR 14-JUL-1999; 99US-0143811.
XX

PA (GENE-) GENESIS RES & DEV CORP LTD.

PA (FLET-) FLETCHER CHALLENGE FORESTS LTD.

XX 

PI Bloksberg LN, Havukkala IJ; 
XX 

DR WPI; 2000-317962/27. 
XX 

PT Novel polynucleotide encoding enzymes involved in lignin-biosynthetic 

PT pathway useful for producing transgenic plants especially eucalyptus 

PT and pine species having altered lignin content, composition and 

PT structure 
XX 

PS Example 1; Page 61-62; 213pp; English. 
XX 

CC The present invention describes isolated polynucleotides and proteins 

CC encoding and representing the enzymes cinnamate 4-hydroxylase (C4H),

CC coumarate 3-hydroxylase (C3H), phenolase (PNL), O-methyl transferase

CC (OMT), cinnamyl alcohol dehydrogenase (CAD), cinnamoyl-CoA reductase

CC (CCR), phenylalanine ammonia-lyase (PAL), 4-coumarate:CoA ligase (4CL),

CC coniferol glucosyl transferase (CGT), coniferin beta-glucosidase (CBG),

CC laccase, peroxidase, ferulate-5-hydroxylase (F5H), alpha-amylase,

CC caffeic acid methyl transferase, caffeoyl CoA methyl transferase, 

CC coumerate CoA ligase, cytochrome P450 LXX1A, diphenol oxidase, flavanol 

CC glucosyl transferase, flavenoid hydroxylase, and isoflavone reductase, 

CC which are involved in the lignin biosynthetic pathway. The 

CC polynucleotides can be used for modulating lignin content, lignin 

CC composition and the structure of a plant, especially eucalyptus and pine 

CC species, and for modifying the activity of an enzyme involved in lignin 

CC biosynthetic pathway, and for producing a plant having altered lignin 

CC content, composition and structure. They can be used for designing probes 

CC and primers useful for detecting similar DNA and RNA sequences in any 



CC organism and for PCR amplification. The lignin content can be efficiently

CC modified using the polynucleotides. AAA67908 to AAA68201 and AAB16341 to

CC AAB16449 represent polynucleotide and protein sequences used in the

CC exemplification of the present invention. 
XX 

SQ Sequence 622 BP; 170 A; 117 C; 178 G; 157 T; 0 other; 



Query Match 13.8%; Score 59.6; DB 21; Length 622;

Best Local Similarity 54.1%; Pred. No. 2.6e-08; 

Matches 119; Conservative 0; Mismatches 101; Indels 0; Gaps 0; 

Qy 61 cctgctgatccctttcgggatggggcggcccaattgccccggggaaacgctcgcgctgcg 120 

II II I I I I I I I I I II I I I I I I I I I I I I I I I I I I I 

Db 180 ccgactattgccgtttgggatggggaggagaagttgtcctggtgctggccttgccaatag 239 

Qy 121 gaccgtcgggctggtgctcgcaacgctgctcaattgcttcgactgggacacggttgatgg 180 

II I I I I I I I I I I II I I I I I I I I I I I I I I I I MM I 
Db 240 agtggtgagcttggtcctggcggcgcttattcagtgcttcgaatgggaacgagttggcga 299 

Qy 181 agctcaggtttgacatgaagctancggcgggctgaccatgccccgggccgtcccgttgga 240 

II III I II II I I I I II I I I II I M II 

Db 300 agaattggtggacttgtccgaggggacgggactcacaatgccaaagagagagccattgga 359 

Qy 241 ggccatgtgcangccgcgtacagctatgcgtggtgttctt 280

I I I I I I I I I I I I II I I III II III 
Db 360 ggccttgtgcaaagcgcgtgaatgcatgatagctaatgtt 399 
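
The percentages printed with each hit appear to follow directly from the counts on the neighbouring lines. The sketch below is an assumption inferred from the figures in this report, not a documented GenCore formula: "Query Match" is taken as the score divided by the perfect score (432 for this query), and "Best Local Similarity" as matches divided by the aligned columns (matches, mismatches and indels). The function names are hypothetical.

```python
def query_match_pct(score, perfect_score):
    """Score as a percentage of the perfect (self-match) score."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity_pct(matches, mismatches, indels=0):
    """Identical columns as a percentage of all aligned columns.

    Counting indels in the denominator is an assumption; every hit in
    this section reports Indels 0, so it cannot be checked here.
    """
    aligned = matches + mismatches + indels
    return round(100.0 * matches / aligned, 1)


# Figures from the hit above: Score 59.6, Matches 119, Mismatches 101.
print(query_match_pct(59.6, 432))           # 13.8
print(best_local_similarity_pct(119, 101))  # 54.1
```

The same two relations reproduce the percentages of the later hits in this listing as well, for example Score 59 with 125 matches and 112 mismatches.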



RESULT 4
AAC47389

ID AAC47389 standard; DNA; 1655 BP.
XX
AC AAC47389;
XX
DT 18-OCT-2000 (first entry)
XX
DE Arabidopsis thaliana DNA fragment SEQ ID NO: 53631.
XX
KW Hybridisation assay; genetic mapping; gene expression control;
KW protein identification; signal transduction pathway;
KW metabolic pathway; promoter; termination sequence; ss.
XX
OS Arabidopsis thaliana.
XX
PN EP1033405-A2.
XX
PD 06-SEP-2000.
XX
PF 25-FEB-2000; 2000EP-0301439.
XX
PR 25-FEB-1999; 99US-0121825.
PR 05-MAR-1999; 99US-0123180.
PR 09-MAR-1999; 99US-0123548.
PR 23-MAR-1999; 99US-0125788.
PR 25-MAR-1999; 99US-0126264.
PR 29-MAR-1999; 99US-0126785.
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99US-0161992 . 
99US-0161993. 
99US-0162142. 



Query Match 13.7%; Score 59; DB 21; Length 1655; 

Best Local Similarity 52.7%; Pred. No. 6e-08; 

Matches 125; Conservative 0; Mismatches 112; Indels 0; Gaps 0; 

Qy 32 gatggctccggcggcaaggccaaggggcccctgctgatccctttcgggatggggcggccc 91 

I I I I I I I I III III I I II I I I I I I I I I I I I I I I I 
Db 1308 gacggatgcggaagcgattactatgtttacaagctgatgccgtttgggaatggccggaga 1367

Qy 92 aattgccccggggaaacgctcgcgctgcggaccgtcgggctggtgctcgcaacgctgctc 151 

I I I I I I I I I I II I ! M I II I I I I I I I I II I I I 

Db 1368 acttgtcccggcgccggattaggtcagaggattgtgacattggcgcttggatcgttgatt 1427 

Qy 152 aattgcttcgactgggacacggttgatggagctcaggtttgacatgaagctancggcggg 211 

I I I I I I I I I I I I I I II I I I I III I I III 

Db 1428 caatgctttgaatgggagaatgtgaaaggggaagagatggatatgtctgagagtactggg 1487

Qy 212 ctgaccatgccccgggccgtcccgttggaggccatgtgcangccgcgtacagctatg 268

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1488 ttgggtatgcgtaagatggatcctttacgggccatgtgtaggcctaggcccattatg 1544 



RESULT 5
AAC37476

ID AAC37476 standard; DNA; 1656 BP.
XX
AC AAC37476;
XX
DT 17-OCT-2000 (first entry)
XX
DE Arabidopsis thaliana DNA fragment SEQ ID NO: 17517.
XX
KW Hybridisation assay; genetic mapping; gene expression control;
KW protein identification; signal transduction pathway;
KW metabolic pathway; promoter; termination sequence; ss.
XX
OS Arabidopsis thaliana.
XX
PN EP1033405-A2.
XX
PD 06-SEP-2000.
XX
PF 25-FEB-2000; 2000EP-0301439.
XX
PR 25-FEB-1999; 99US-0121825.
PR 05-MAR-1999; 99US-0123180.
PR 09-MAR-1999; 99US-0123548.
PR 23-MAR-1999; 99US-0125788.
PR 25-MAR-1999; 99US-0126264.
PR 29-MAR-1999; 99US-0126785.
PR 01-APR-1999; 99US-0127462.
PR 06-APR-1999; 99US-0128234.
PR 08-APR-1999; 99US-0128714.
PR 16-APR-1999; 99US-0129845.
PR 19-APR-1999; 99US-0130077.
PR 21-APR-1999; 99US-0130449.
PR 23-APR-1999; 99US-0130510.
PR 23-APR-1999; 99US-0130891.
PR 28-APR-1999; 99US-0131449.
PR 30-APR-1999; 99US-0132048.
PR 30-APR-1999; 99US-0132407.
PR 04-MAY-1999; 99US-0132484.
PR 05-MAY-1999; 99US-0132485.
PR 06-MAY-1999; 99US-0132486.
PR 06-MAY-1999; 99US-0132487.
PR 07-MAY-1999; 99US-0132863.
PR 11-MAY-1999; 99US-0134256.
PR 14-MAY-1999; 99US-0134218.
PR 14-MAY-1999; 99US-0134219.
PR 14-MAY-1999; 99US-0134221.
PR 14-MAY-1999; 99US-0134370.
PR 18-MAY-1999; 99US-0134768.
PR 19-MAY-1999; 99US-0134941.
PR 20-MAY-1999; 99US-0135124.
PR 21-MAY-1999; 99US-0135353.
PR 24-MAY-1999; 99US-0135629.
PR 25-MAY-1999; 99US-0136021.
PR 27-MAY-1999; 99US-0136392.
PR 28-MAY-1999; 99US-0136782.
PR 01-JUN-1999; 99US-0137222.
PR 03-JUN-1999; 99US-0137528.
PR 04-JUN-1999; 99US-0137502.
PR 07-JUN-1999; 99US-0137724.
PR 08-JUN-1999; 99US-0138094.
PR 10-JUN-1999; 99US-0138540.
PR 10-JUN-1999; 99US-0138847.
PR 14-JUN-1999; 99US-0139119.
PR 16-JUN-1999; 99US-0139452.
PR 16-JUN-1999; 99US-0139453.
PR 17-JUN-1999; 99US-0139492.
PR 18-JUN-1999; 99US-0139454.
PR 18-JUN-1999; 99US-0139455.
PR 18-JUN-1999; 99US-0139456.
PR 18-JUN-1999; 99US-0139457.
PR 18-JUN-1999; 99US-0139458.
PR 18-JUN-1999; 99US-0139459.
PR 18-JUN-1999; 99US-0139460.
PR 18-JUN-1999; 99US-0139461.
PR 18-JUN-1999; 99US-0139462.
PR 18-JUN-1999; 99US-0139463.
PR 18-JUN-1999; 99US-0139750.
PR 18-JUN-1999; 99US-0139763.
PR 21-JUN-1999; 99US-0139817.
PR 22-JUN-1999; 99US-0139899.
PR 23-JUN-1999; 99US-0140353.
PR 23-JUN-1999; 99US-0140354.
PR 24-JUN-1999; 99US-0140695.
PR 28-JUN-1999; 99US-0140823.
PR 29-JUN-1999; 99US-0140991.
PR 30-JUN-1999; 99US-0141287.
PR 01-JUL-1999; 99US-0141842.



PR 01-JUL-1999
PR 02-JUL-1999
PR 06-JUL-1999
PR 08-JUL-1999
PR 09-JUL-1999
PR 12-JUL-1999


PR 


1 O 


- J UL- 


1 o n n 

i y y y 


PR 14-JUL-1999
PR 15-JUL-1999
PR 16-JUL-1999
PR 16-JUL-1999
PR 19-JUL-1999
PR 19-JUL-1999
PR 19-JUL-1999
PR 19-JUL-1999
PR 19-JUL-1999
PR 19-JUL-1999
PR 20-JUL-1999
PR 20-JUL-1999
PR 20-JUL-1999
PR 21-JUL-1999
PR 21-JUL-1999
PR 21-JUL-1999
PR 22-JUL-1999
PR 22-JUL-1999
PR 22-JUL-1999
PR 22-JUL-1999
PR 23-JUL-1999
PR 23-JUL-1999
PR 23-JUL-1999


PR 


a r 


- JUL- 


1999 


PR 


I I 


- JUL- 


1 A A A 

1999 


PR 27-JUL-1999
PR 27-JUL-1999
PR 28-JUL-1999
PR 02-AUG-1999
PR 02-AUG-1999
PR 02-AUG-1999
PR 03-AUG-1999
PR 04-AUG-1999
PR 04-AUG-1999
PR 05-AUG-1999
PR 05-AUG-1999
PR 06-AUG-1999
PR 06-AUG-1999
PR 09-AUG-1999
PR 09-AUG-1999
PR 10-AUG-1999
PR 11-AUG-1999
PR 12-AUG-1999
PR 13-AUG-1999
PR 13-AUG-1999
PR 16-AUG-1999
PR 17-AUG-1999
PR 18-AUG-1999
PR 20-AUG-1999
PR 20-AUG-1999



99US-0142154 
99US-0142055 
99US-0142390 
99US-0142803 
99US-0142920 
99US-0142977 
99US-0143542 
99US-0143624 
99US-0144005 
99US-0144085 
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Query Match 13.4%; Score 58; DB 21; Length 1656; 

Best Local Similarity 52.5%; Pred. No. 1.2e-07; 

Matches 124; Conservative 0; Mismatches 112; Indels 0; Gaps 0;



Qy 32 gatggctccggcggcaaggccaaggggcccctgctgatccctttcgggatggggcggccc 91 

I I I I I I I I II I III I I I I I I I I I I I I I I I I I I I I 
Db 1308 gacggatgcggaagcgattactatgtttacaagctgatgccgtttgggaatggccggaga 1367 

Qy 92 aattgccccggggaaacgctcgcgctgcggaccgtcgggctggtgctcgcaacgctgctc 151 

I I I I I I I I I I II I I I I I II I I I I I I I I I I I I I 

Db 1368 acttgtcccggcgccggattaggtcagaggattgtgacattggcgcttggatcgttgatt 1427 

Qy 152 aattgcttcgactgggacacggttgatggagctcaggtttgacatgaagctancggcggg 211 

I I I I I I I I I I I I I I II I I I I III I I III 

Db 1428 caatgctttgaatgggagaatgtgaaaggggaagagatggatatgtctgagagtactggg 1487

Qy 212 ctgaccatgccccgggccgtcccgttggaggccatgtgcangccgcgtacagctat 267 

II I I I I I I I I I I I I I I I I I I I I I I I I I III 

Db 1488 ttgggtatgcgtaagatggatcctttacgggccatgtgtaggcctaggcccattat 1543



RESULT 6 
AAX58406 

ID AAX58406 standard; cDNA; 1674 BP. 
XX 

AC AAX58406; 
XX 

DT 02-AUG-1999 (first entry) 
XX 

DE Jerusalem artichoke in-chain hydroxylase CYP81B1 clone D. 
XX 

KW In-chain hydroxylase; transgenic plant; lipid; hydroxylation; 

KW oilseed; vegetable oil; crop protection; Jerusalem artichoke; 

KW CYP81B1; cytochrome P450; ss . 
XX 

OS Helianthus tuberosus . 
XX 

FH Key Location/Qualifiers 

FT CDS 14.. 1531 

FT /*tag= a 
XX 

PN WO9918224-A1.
XX

PD 15-APR-1999.
XX

PF 06-OCT-1998; 98WO-IB01716.
XX

PR 06-OCT-1997; 97US-0060960.
XX

PA (CNRS ) CENT NAT RECH SCI.
XX

PI Batard Y, Benveniste I, Cabello-Huartado F, Durst F;

PI Helvig C, Le Bouquin R, Pinot F, Salaun J, Tijet N; 

PI Werck-Reichhart D; 
XX 

DR WPI; 1999-264030/22. 

DR P-PSDB; AAY05902. 



XX 

PT Nucleic acid encoding plant fatty acid hydroxylases 
XX 

PS Example 4; Fig 20A-B; 157pp; English. 
XX 

CC This is the DNA sequence of clone D encoding in-chain hydroxylase 

CC CYP81B1 (see AAY05902) of Jerusalem artichoke. Clone D was isolated 

CC from a tuber tissue cDNA library by PCR amplification. CYP81B1 

CC is a microsomal cytochrome P450 that catalyses the omega-2, omega-3 

CC and omega-4 hydroxylation of capric, lauric and myristic acids. 

CC The major metabolite is the omega-3-hydroxylated compound. The 

CC invention provides isolated nucleic acids (see AAX58400-06) encoding

CC plant fatty acid hydroxylases (see AAY05896-902). Also claimed are

CC host cells, transgenic plants and compositions consisting of the 

CC plant fatty acid hydroxylase, a process for isolating additional 

CC fatty acid hydroxylase genes from a plant, and a process of 

CC altering fatty acid composition in a plant by expressing the plant 

CC fatty acid hydroxylase in a transgenic plant, and hydroxylating or 

CC epoxidating a fatty acid substrate in the plant. Manipulating the 

CC hydroxylated fatty acid content of plants will modify resistance to 

CC drought and attack by insects and other pests. The transgenic 

CC plants may also be used as sources of hydroxylated and epoxidized 

CC fatty acids useful in the manufacture of e.g. lubricants, anti-slip

CC agents, plasticisers, coating agents, detergents and surfactants.
XX 

SQ Sequence 1674 BP; 427 A; 364 C; 404 G; 479 T; 0 other; 



Query Match 13.1%; Score 56.6; DB 20; Length 1674; 

Best Local Similarity 49.8%; Pred. No. 3.2e-07; 

Matches 140; Conservative 0; Mismatches 141; Indels 0; Gaps 0;

Qy 12 cggacttacccggccggttcgatggctccggcggcaaggccaaggggcccctgctgatcc 71 

II I I I I I MM II 111 I III I I I I I I II 

Db 1251 cgttcaaaccagaaaggtttgaagggttagaagggacacgggatgggtttaagttattgc 1310 

Qy 72 ctttcgggatggggcggcccaattgccccggggaaacgctcgcgctgcggaccgtcgggc 131 

I I I I I I II II I I I I I I I I M I I I I I I I II I I III 

Db 1311 catttgggtctggaaggaggagttgtcctggggaaggcttggcggttcgaatgcttggga 1370 

Qy 132 tggtgctcgcaacgctgctcaattgcttcgactgggacacggttgatggagctcaggttt 191 

II II I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1371 tgactttagggtcaattattcaatgcttcgattgggaacgaacgagtgaagagttggttg 1430 

Qy 192 gacatgaagctancggcgggctgaccatgccccgggccgtcccgttggaggccatgtgca 251 

I I I I I I I I I I I I I I III I I I I I I I I I I I I I I I 

Db 1431 atatgactgaaggtcctgggctaaccatgcctaaggctataccattggtagctaagtgca 1490 

Qy 252 ngccgcgtacagctatgcgtggtgttcttaagaggctctga 292 

I I I I I II I I I I I I I I I I 

Db 1491 aacctcgggttgagatgacgaatctactgtccgaactgtga 1531



RESULT 7 
AAA29326 

ID AAA29326 standard; cDNA; 1859 BP. 
XX 



AC AAA29326; 
XX 

DT 26-SEP-2000 (first entry) 
XX 

DE Soybean isoflavone-2-hydroxylase coding sequence.
XX

KW Soybean; isoflavone-2-hydroxylase; flavonol; biosynthesis; anthocyanin;

KW flower colour; pollen tube; feeding deterrent; UV irradiation; ss. 

XX 

OS Glycine max. 
XX 

FH Key Location/Qualifiers 

FT CDS 59..1561

FT /*tag= a

FT /product= Isoflavone-2-hydroxylase

XX 

PN WO200037652-A2. 
XX 

PD 29-JUN-2000. 
XX 

PF 20-DEC-1999; 99WO-US30337.
XX

PR 21-DEC-1998; 98US-0113190.
XX

PA (DUPO ) DU PONT DE NEMOURS & CO E I.
XX

PI Famodu OO, McGonigle B, Odell JT, Fader GM, Falco SC;
XX 

DR WPI; 2000-442678/38. 

DR P-PSDB; AAY96593. 
XX 

PT New polynucleotide encoding flavonoid biosynthetic enzymes, useful for 

PT producing transgenic plants and immunological screening of cDNA 

PT libraries 
XX 

PS Claim 1; Page 30; 36pp; English. 
XX 

CC This cDNA, isolated from clone slslc.pk005.n3, encodes a plant (soybean)

CC isoflavone-2-hydroxylase. It was determined using the sequence of an

CC isoflavone-2-hydroxylase encoded by a contig composed of clones

CC sgclc.pk001.gl7, sgs2c.pk004.h7 and slf1.pk0034.gl. The cDNA sequences

CC can be used for the recombinant production of the enzyme, to isolate 

CC homologues, to create transgenic plants and to provide probes for 

CC genetically and physically mapping genes and as markers for traits linked 

CC to the genes. The proteins can be used for immunological screening, in 

CC particular to raise antibodies against the enzymes. The enzyme and its 

CC gene are useful to study flavonol biosynthesis in plants and provide 

CC means to enhance or otherwise alter flavonol and anthocyanin 

CC biosynthesis. Flavonoids have diverse functions, such as co-pigments in 

CC flower colour, stimulation of pollen tube growth, pollinator attraction, 

CC and feeding deterrents and protection against UV irradiation in fruits 

CC and seeds. 

XX 

SQ Sequence 1859 BP; 536 A; 402 C; 417 G; 504 T; 0 other; 



Query Match 12.4%; Score 53.4; DB 21; Length 1859;

Best Local Similarity 53.1%; Pred. No. 3.1e-06;

Matches 111; Conservative 0; Mismatches 98; Indels 0; Gaps 0;



Qy 54 aggggcccctgctgatccctttcgggatggggcggcccaattgccccggggaaacgctcg 113 

I I I I I' I I I ' I I I I I I I I I I I I I I I I I I I I I I I II 

Db 1329 aggagaaaaagttggtagcatttggcatgggaagaagggcttgcccaggagaacccatgg 1388 

Qy 114 cgctgcggaccgtcgggctggtgctcgcaacgctgctcaattgcttcgactgggacacgg 173

I III I I I I I I III I I I I I II I I I I I I I I I I 

Db 1389 ctatgcaaagtgtcagctttactttgggattgttgattcaatgttttgactggaaacgag 1448 

Qy 174 ttgatggagctcaggtttgacatgaagctancggcgggctgaccatgccccgggccgtcc 233

I Ml I I II II I I I I I I I I I I I I I II 

Db 1449 taagtgaggaaaagcttgatatgacagagaacaattggatcaccttgtcaaggttaattc 1508

Qy 234 cgttggaggccatgtgcangccgcgtaca 262 

I I I I I I I I II I I I I I I I I I I I II 
Db 1509 cattggaggccatgtgcaaggctcgccca 1537 



RESULT 8 
AAQ50511 

ID AAQ50511 standard; cDNA; 1817 BP. 
XX 

AC AAQ50511; 
XX 

DT 17-MAY-1994 (first entry)
XX
DE Bx1 gene.
XX
KW Bx1; resistance; plant; benzoxazine; biosynthesis; allele;
KW European corn borer; pest; vector; clone; ds.
XX
OS Zea mays.
XX
FH Key Location/Qualifiers
FT CDS 78..1670
FT /*tag= a
FT /product= Bx1_gene_product
XX
PN WO9322441-A.
XX
PD 11-NOV-1993.
XX 

PF 23-APR-1992; 92WO-EP00905 . 
XX 

PR 23-APR-1992; 92WO-EP00905 . 
XX 

PA (PLAC ) MAX PLANCK GES FOERDERUNG WISSENSCHAFTEN . 
XX 

PI Frey M, Gierl A, Peterson PA, Saedler H, Sommer H; 
XX 

DR WPI; 1993-368800/46. 
DR P-PSDB; AAR43024. 
XX 

PT DNA sequence of Bx1 gene - used to confer resistance on plants
PT with low or no levels of benzoxazine(s)



XX 

PS Claim 1; Fig 1; 28pp; English. 
XX 

CC The sequence encodes a protein involved in the biosynthesis of 

CC benzoxazines, which are used by plants as a poison / deterrent 

CC on insects and microorganisms. The protein can be expressed 

CC in transformed plants, enhancing their ability to combat infection. 

XX 

SQ Sequence 1817 BP; 353 A; 607 C; 548 G; 309 T; 0 other; 



Query Match 11.9%; Score 51.6; DB 14; Length 1817; 

Best Local Similarity 58.4%; Pred. No. 1.1e-05;

Matches 90; Conservative 0; Mismatches 64; Indels 0; Gaps 0;

Qy 19 acccggccggttcgatggctccggcggcaaggccaaggggcccctgctgatccctttcgg 78 

I I Mill II I III I I I I I I I I I I I I I I I I I I I 
Db 1409 acaaggccgcgacgccgaggtcgacatgtacggcaaggacatccggttcgtgccgttcgg 1468 

Qy 79 gatggggcggcccaattgccccggggaaacgctcgcgctgcggaccgtcgggctggtgct 138 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1469 ggctgggcgcaggatctgcgcgggggccacgttcgccatcgccaccgtcgagatcatgct 1528 

Qy 139 cgcaacgctgctcaattgcttcgactgggacacg 172 

I I I I II III I I I I I I I I I I I I I 
Db 1529 cgcgaacctcatctaccatttcgactgggagatg 1562 
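
Each hit in this listing carries a statistics line of the form "Query Match ...; Score ...; DB ...; Length ...;". As a minimal sketch of pulling those fields out of the plain-text report, the regular expression and the function name `parse_hit_line` below are illustrative assumptions, not part of GenCore, and they assume the line has already been cleaned of extraction damage.

```python
import re

# Pattern for a cleaned statistics line such as:
#   "Query Match 11.9%; Score 51.6; DB 14; Length 1817;"
HIT_LINE = re.compile(
    r"Query Match\s+([\d.]+)%;\s*Score\s+([\d.]+);"
    r"\s*DB\s+(\d+);\s*Length\s+(\d+);"
)


def parse_hit_line(line):
    """Return the four fields as a dict, or None if the line does not match."""
    m = HIT_LINE.search(line)
    if m is None:
        return None
    return {
        "query_match_pct": float(m.group(1)),
        "score": float(m.group(2)),
        "db": int(m.group(3)),
        "length": int(m.group(4)),
    }


print(parse_hit_line("Query Match 11.9%; Score 51.6; DB 14; Length 1817;"))
```

Lines that still contain stray characters (for example a comma after "Query Match") will not match and return None, which makes damaged records easy to flag for manual repair.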

RESULT 9 
AAZ50024 

ID AAZ50024 standard; cDNA; 1847 BP. 
XX 

AC AAZ50024; 
XX 

DT 25-APR-2000 (first entry) 
XX 

DE Maize cytochrome p450 monooxygenase; CYP71C3v2 cDNA. 
XX 

KW Cytochrome p450 monooxygenase; CYP71C3v2; maize; chromosome 4p; weed;
KW p450 gene; molecular dioxygen; herbicidal; pigweed; transgenic organism
KW herbicide resistant; triasulfuron; quack grass; velvet leaf;
KW lambs quarter; Chenopodium album; ss.
XX
OS Zea mays.
XX
FH Key Location/Qualifiers
FT 5'UTR 1..6
FT /*tag= a
FT CDS 7..1611
FT /*tag= b
FT /product= "Maize cytochrome p450 monooxygenase
FT CYP71C3v2"
FT misc_feature 540..541
FT /*tag= c
FT /note= "intron 1 (AAZ50025) is located between these
FT nucleotides"
FT misc_feature 981..982
FT /*tag= d
FT /note= "intron 2 (AAZ50026) is located between these
FT nucleotides"
FT polyA_signal 1678..1683
FT /*tag= e
FT polyA_signal 1700..1709
FT /*tag= f
FT polyA_signal 1728..1733
FT /*tag= g
FT polyA_signal 1763..1768
FT /*tag= h
FT polyA_signal 1806..1811
FT /*tag= i
FT polyA_site 1762
FT /*tag= j
FT /note= "putative polyadenylation site"
FT polyA_site 1799
FT /*tag= k
FT /note= "putative polyadenylation site"
FT polyA_site 1833
FT /*tag= l
FT /note= "putative polyadenylation site"
XX
PN WO200000502-A1.
XX
PD 06-JAN-2000.
XX
PF 23-JUN-1999; 99WO-US14117.
XX
PR 26-JUN-1998; 98US-0090759.
XX
PA (UNII ) UNIV ILLINOIS FOUND.
XX
PI Schuler MA, Persans MW;
XX
DR WPI; 2000-170902/15.
DR P-PSDB; AAY44726.
XX
PT Novel maize cytochrome P450 monooxygenase polypeptides and
PT polynucleotides, used to confer triasulfuron herbicide resistance to
PT plants
XX
PS Claim 4; Page 46-48; 77pp; English.
XX
CC The present sequence is the cDNA encoding maize cytochrome p450
CC monooxygenase, CYP71C3v2. CYP71C3v2 gene is mapped to a single locus on
CC the short arm of maize chromosome 4 (4p) and has two introns. It is
CC encoded by a single copy or a small number of closely linked p450 genes.
CC CYP71C3v2 reductively cleaves molecular dioxygen to produce
CC functionalised organic substrates. It has herbicidal activity.
CC CYP71C3v2 polynucleotides are used to produce transgenic organisms, such
CC as yeast, plants and bacteria that are resistant to herbicides, such as
CC triasulfurons. Undesired vegetation, e.g. weed, pigweed, velvet leaf,
CC lambs quarters, Chenopodium album and quack grass, can easily be
CC controlled when such transgenic plants are grown. Transformed organisms
CC can also be used to identify compounds with herbicidal activity.
XX
SQ Sequence 1847 BP; 386 A; 576 C; 555 G; 330 T; 0 other;

Query Match 11.4%; Score 49.4; DB 21; Length 1847;
Best Local Similarity 58.5%; Pred. No. 5e-05;
Matches 86; Conservative 0; Mismatches 61; Indels 0; Gaps 0;



[Alignment rows starting at Qy 22 / Db 1356, Qy 82 / Db 1416 and Qy 142 / Db 1476; sequence text not recoverable.]



RESULT 10
AAZ87320

ID AAZ87320 standard; cDNA; 1848 BP.
XX
AC AAZ87320;
XX
DT 22-MAY-2000 (first entry)
XX
DE Maize cytochrome P450 monooxygenase CYP71C3v2 full-length cDNA.
XX
KW Cytochrome P450 monooxygenase; CYP71C3v2; herbicide detoxification;
KW triasulfuron; transgenic plant; herbicide identification; ss.
XX
OS Zea mays.
XX
FH Key Location/Qualifiers
FT CDS 7..1611
FT /*tag= a
FT /product= "Maize cytochrome P450 monooxygenase,
FT CYP71C3v2"
FT exon 1..540
FT /*tag= b
FT /number= 1
FT /note= "In genomic DNA, intron 1 (AAZ87321) lies between
FT exons 1 and 2"
FT exon 541..981
FT /*tag= c
FT /number= 2
FT /note= "In genomic DNA, intron 2 (AAZ87322) lies between
FT exons 2 and 3"
FT exon 982..1847
FT /*tag= c
FT /number= 3
XX
PN WO200000585-A2.
XX
PD 06-JAN-2000.
XX



PF 28-JUN-1999; 99WO-US14689.
XX

PR 26-JUN-1998; 98US-0090759.
XX 

PA (UNII ) UNIV ILLINOIS FOUND. 
XX 

PI Schuler MA, Persans MW; 
XX 

DR WPI; 2000-170909/15. 

DR P-PSDB; AAY77232. 
XX 

PT Novel maize cytochrome P450 monooxygenase cDNA used to confer herbicide 

PT resistance to plants 

XX 

PS Claim 2; Fig 1; 85pp; English. 
XX

CC The present sequence represents a full-length cDNA encoding maize
CC cytochrome P450 monooxygenase CYP71C3v2. cDNA was generated via reverse
CC transcriptase-PCR (RT-PCR) from poly(A)+ mRNA isolated from naphthalic
CC anhydride and herbicide (triasulfuron)-treated maize seedlings. This was
CC used to construct a cDNA library, which was screened using previously
CC generated cDNA as hybridisation probes. The CYP71C3v2 cDNA clone was
CC extended via 5' RACE (rapid amplification of cDNA ends) and cloned into
CC pBluescript. Genomic DNA was also screened for clones encoding
CC CYP71C3v2 - this was found to contain 2 introns (AAZ87321-Z87322).
CC Cytochrome P450 monooxygenase CYP71C3v2 reductively cleaves molecular
CC dioxygen to produce functionalised organic substrates. Nucleotides
CC encoding cytochrome P450 monooxygenase CYP71C3v2 are used to produce
CC transgenic plants with increased resistance to herbicides, such as
CC triasulfuron. When such transgenic plants are grown, undesired
CC vegetation such as pigweed, velvet leaf, lambs quarters, Chenopodium
CC album and quack grass, can easily be controlled. The methods may also be
CC used to identify those compounds with herbicidal activity.

XX 

SQ Sequence 1848 BP; 387 A; 577 C; 555 G; 329 T; 0 other;



Query Match 11.4%; Score 49.4; DB 21; Length 1848; 

Best Local Similarity 58.5%; Pred. No. 5e-05; 

Matches 86; Conservative 0; Mismatches 61; Indels 0; Gaps 0;

Qy 22 cggccggttcgatggctccggcggcaaggccaaggggcccctgctgatccctttcgggat 81 

I I I I I I II I III II I I I I I I I I I I I I I I I I 
Db 1356 cggctgggacaagtccaacagctacagcggccaggacttcaggtacctgccgttcgggtc 1415 

Qy 82 ggggcggcccaattgccccggggaaacgctcgcgctgcggaccgtcgggctggtgctcgc 141 

I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1416 tgggcgccggatctgccccggggccaacttcgcgctcgcgaccatggagatcatgctcgc 1475 



Qy 142 aacgctgctcaattgcttcgactggga 168 

I II I I I I I I I I I I I I I 

Db 1476 caacctcatgtaccatttcgactggga 1502 



RESULT 11 
AAC42545 

ID AAC42545 standard; DNA; 1545 BP. 



XX 

AC AAC42545; 
XX 

DT 17-OCT-2000 (first entry) 
XX 

DE Arabidopsis thaliana DNA fragment SEQ ID NO: 35961. 
XX 

KW Hybridisation assay; genetic mapping; gene expression control; 

KW protein identification; signal transduction pathway; 

KW metabolic pathway; promoter; termination sequence; ss. 
XX 

OS Arabidopsis thaliana. 
XX 

PN EP1033405-A2.
XX
PD 06-SEP-2000.
XX
PF 25-FEB-2000; 2000EP-0301439.
XX
PR 25-FEB-1999; 99US-0121825.
PR 05-MAR-1999; 99US-0123180.
PR 09-MAR-1999; 99US-0123548.
PR 23-MAR-1999; 99US-0125788.
PR 25-MAR-1999; 99US-0126264.
PR 29-MAR-1999; 99US-0126785.
PR 01-APR-1999; 99US-0127462.
PR 06-APR-1999; 99US-0128234.
PR 08-APR-1999; 99US-0128714.
PR 16-APR-1999; 99US-0129845.
PR 19-APR-1999; 99US-0130077.
PR 21-APR-1999; 99US-0130449.
PR 23-APR-1999; 99US-0130510.
PR 23-APR-1999; 99US-0130891.
PR 28-APR-1999; 99US-0131449.
PR 30-APR-1999; 99US-0132048.
PR 30-APR-1999; 99US-0132407.
PR 04-MAY-1999; 99US-0132484.
PR 05-MAY-1999; 99US-0132485.
PR 06-MAY-1999; 99US-0132486.
PR 06-MAY-1999; 99US-0132487.
PR 07-MAY-1999; 99US-0132863.
PR 11-MAY-1999; 99US-0134256.
PR 14-MAY-1999; 99US-0134218.
PR 14-MAY-1999; 99US-0134219.
PR 14-MAY-1999; 99US-0134221.
PR 14-MAY-1999; 99US-0134370.
PR 18-MAY-1999; 99US-0134768.
PR 19-MAY-1999; 99US-0134941.
PR 20-MAY-1999; 99US-0135124.
PR 21-MAY-1999; 99US-0135353.
PR 24-MAY-1999; 99US-0135629.
PR 25-MAY-1999; 99US-0136021.
PR 27-MAY-1999; 99US-0136392.
PR 28-MAY-1999; 99US-0136782.
PR 01-JUN-1999; 99US-0137222.
PR 03-JUN-1999; 99US-0137528.
PR 04-JUN-1999; 99US-0137502.
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99US-0139460 
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99US-0139750 
99US-0139763 
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99US-0140695 
99US-0140823 
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99US-0141287 
99US-0141842 
99US-0142154
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PR 27-AUG-1999, 

PR 30-AUG-1999, 

PR 31-AUG-1999, 

PR 01-SEP-1999,

PR 07-SEP-1999, 

PR 10-SEP-1999, 

PR 13-SEP-1999, 

PR 15-SEP-1999, 

PR 16-SEP-1999, 

PR 20-SEP-1999, 

PR 22-SEP-1999, 

PR 23-SEP-1999, 

PR 24-SEP-1999, 

PR 28-SEP-1999, 

PR 29-SEP-1999, 

PR 04-OCT-1999, 

PR 05-OCT-1999, 

PR 06-OCT-1999, 

PR 07-OCT-1999, 

PR 08-OCT-1999, 

PR 12-OCT-1999, 
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Query Match 11.3%; Score 48.8; DB 21; Length 1545; 

Best Local Similarity 52.5%; Pred. No. 7e-05;

Matches 104; Conservative 0; Mismatches 94; Indels 0; Gaps 0; 

Qy 64 gctgatccctttcgggatggggcggcccaattgccccggggaaacgctcgcgctgcggac 123 

E I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III 

Db 1326 gctaatctcatttgggatgggacgaagagcttgtcctggagccgggctagctcatcggct 1385 

Qy 124 cgtcgggctggtgctcgcaacgctgctcaattgcttcgactgggacacggttgatggagc 183 

I III II ! II III I II II II HI I I I I I I I I I I 
Db 1386 aataaaccaggctcttggaagtttggttcaatgttttgagtgggaaagagttggtgagga 1445 

Qy 184 tcaggtttgacatgaagctancggcgggctgaccatgccccgggccgtcccgttggaggc 243 

I II I I II I I I I I 1 I I I I I I I I II 

Db 1446 ttttgtggacatgaccgaagacaaaggagccacattgcccaaagctataccattaagagc 1505

Qy 244 catgtgcangccgcgtac 261 

I I I I I I I I I III I 
Db 1506 catgtgcaaagcacgttc 1523 



RESULT 12 
AAC47416 

ID AAC47416 standard; DNA; 1576 BP. 
XX 

AC AAC47416; 
XX 

DT 18-OCT-2000 (first entry) 



XX 

DE Arabidopsis thaliana DNA fragment SEQ ID NO: 53738. 
XX 

KW Hybridisation assay; genetic mapping; gene expression control; 
KW protein identification; signal transduction pathway; 
KW metabolic pathway; promoter; termination sequence; ss. 
XX 

OS Arabidopsis thaliana. 
XX 

PN EP1033405-A2. 
XX 
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Query Match 11.3%; Score 48.8; DB 21; Length 1576; 

Best Local Similarity 50.0%; Pred. No. 7.1e-05; 

Matches 119; Conservative 0; Mismatches 119; Indels 0; Gaps 0;

Qy 31 cgatggctccggcggcaaggccaaggggcccctgctgatccctttcgggatggggcggcc 90 

I I I I I III III II I I I I I I I I I I I I I I I I I I I 
Db 1263 cggtggagaaggagaaaaagatgatgttcgtatgctgatagcgtttggaagcggacggag 1322

Qy 91 caattgccccggggaaacgctcgcgctgcggaccgtcgggctggtgctcgcaacgctgct 150 

I I I I I I I I I I I I I I I II II I I I I I I I I I I 

Db 1323 aatatgtcccggtgttggactagcgcacaagattgtgacattagcgttaggatcgttaat 1382 

Qy 151 caattgcttcgactgggacacggttgatggagctcaggtttgacatgaagctancggcgg 210 

I I I I I I I I I I I I I I I I III MM I II 

Db 1383 tcaatgctttgattggaaaaaggtgaacgaaaaagagattgatatgagtgagggtccggg 14 42 

Qy 211 gctgaccatgccccgggccgtcccgttggaggccatgtgcangccgcgtacagctatg 268 

I I I I I I II I . II II I I I I I I I I I I I I I I I I III 
Db 1443 gatggctatgcgtatgatggtgccgttacgagccttgtgtaagactcgacccataatg 1500



RESULT 13 
AAC35968 

ID AAC35968 standard; DNA; 1578 BP. 
XX 

AC AAC35968; 
XX 

DT 17-OCT-2000 (first entry) 
XX 

DE Arabidopsis thaliana DNA fragment SEQ ID NO: 12072. 
XX 

KW Hybridisation assay; genetic mapping; gene expression control; 



KW protein identification; signal transduction pathway; 

KW metabolic pathway; promoter; termination sequence; ss. 
XX 

OS Arabidopsis thaliana. 
XX 

PN EP1033405-A2. 
XX 
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Query Match 11.3%; Score 48.8; DB 21; Length 1578; 

Best Local Similarity 50.0%; Pred. No. 7.1e-05;
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Best Local Similarity 52.0%; Pred. No. 0.00028; 

Matches 102; Conservative 0; Mismatches 94; Indels 0; Gaps 0; 
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Db 1266 gctaatgccgttcgggctaggaagaagggcatgtcctggatccggtttggctcagcggct 1325 
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RESULT 15 
AAC47053 

ID AAC47053 standard; DNA; 1519 BP. 
XX 

AC AAC47053; 
XX 

DT 18-OCT-2000 (first entry) 
XX 

DE Arabidopsis thaliana DNA fragment SEQ ID NO: 52389. 
XX 

KW Hybridisation assay; genetic mapping; gene expression control; 

KW protein identification; signal transduction pathway; 

KW metabolic pathway; promoter; termination sequence; ss. 
XX 

OS Arabidopsis thaliana. 
XX 

PN EP1033405-A2 . 
XX 

PD 06-SEP-2000. 
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10.1%; 



Score 43.8; DB 21; Length 1519;
Best Local Similarity 49.8%; Pred. No. 0.0023;
Matches 108; Conservative 0; Mismatches 109; Indels 0; Gaps 0;



Qy 64 gctgatccctttcgggatggggcggcccaattgccccggggaaacgctcgcgctgcggac 123 

III I I I I I I III I I I I I I I I I I I I III 

Db 1297 gcttctggcgtttggattaggtagaagagcgtgtcctggatcgggtctggcccaacgaat 1356

Qy 124 cgtcgggctggtgctcgcaacgctgctcaattgcttcgactgggacacggttgatggagc 183 

II I I I I I I I I I I I III I I I I I I I I I I I I I I I I I I I 
Db 1357 cgtgggactagctctcgggtcattgatacaatgctttgaatgggagagagttgggaatgt 1416 

Qy 184 tcaggtttgacatgaagctancggcgggctgaccatgccccgggccgtcccgttggaggc 243

III II I I I I I I I I II I I I I I I I I I I I 

Db 1417 ggaagtggatatgaaggaaggagttgggaatactgtacccaaagcgattcctttgaaagc 1476

Qy 244 catgtgcangccgcgtacagctatgcgtggtgttctt 280 

MINI I I I I I I I I I I I I I I 
Db 1477 tatttgcaaagctcgtccatttctacataagattatt 1513 



Search completed: February 7, 2002, 11:00:07 
Job time : 4993 sec

GenCore version 4.5
Copyright (c) 1993 - 2000 Compugen Ltd.

OM nucleic - nucleic search, using sw model

Run on:          February 7, 2002, 11:12:10 ; Search time 172.96 Seconds
                 (without alignments)
                 565.671 Million cell updates/sec

Title:           US-09-394-745-6514
Perfect score:   432
Sequence:        1 gtccagcagctcggacttac attttctttttttttcttgg 432

Scoring table:   IDENTITY_NUC
                 Gapop 10.0 , Gapext 1.0

Searched:        351203 seqs, 113238999 residues

Total number of hits satisfying chosen parameters: 702406

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database:        Issued_Patents_NA:*
                 1: /cgn2_6/ptodata/2/ina/5A_COMB.seq:*
                 2: /cgn2_6/ptodata/2/ina/5B_COMB.seq:*
                 3: /cgn2_6/ptodata/2/ina/6A_COMB.seq:*
                 4: /cgn2_6/ptodata/2/ina/6B_COMB.seq:*
                 5: /cgn2_6/ptodata/2/ina/PCTUS_COMB.seq:*
                 6: /cgn2_6/ptodata/2/ina/backfiles1.seq:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



SUMMARIES 

                 %
Result         Query
  No.   Score  Match    Length  DB  ID                 Description

    1    59.6   13.8       622   2  US-08-975-316-17   Sequence 17, Appl
    2      42    9.7      1506   4  US-09-158-767-7    Sequence 7, Appli
    3      42    9.7      1506   4  US-09-158-767-8    Sequence 8, Appli
    4      42    9.7      1506   4  US-09-158-767-9    Sequence 9, Appli
    5      42    9.7      2261   4  US-09-158-767-1    Sequence 1, Appli
    6    38.4    8.9      1762   3  US-08-881-784-5    Sequence 5, Appli
    7    38.4    8.9      1762   4  US-09-292-768-1    Sequence 1, Appli
    8    38.4    8.9      1762   4  US-09-292-768-63   Sequence 63, Appl
    9    38.4    8.9      1762   4  US-09-292-768-65   Sequence 65, Appl
   10    38.4    8.9      1762   4  US-09-172-339-5    Sequence 5, Appli
   11      37    8.6      1929   4  US-09-380-420C-1   Sequence 1, Appli
c  12      36    8.3      8438   1  US-07-945-283-1    Sequence 1, Appli
c  13    35.4    8.2      6387   1  US-07-721-775A-1   Sequence 1, Appli
c  14    35.4    8.2      6387   1  US-08-339-658-1    Sequence 1, Appli
   15    34.6    8.0      1219   4  US-09-025-819-28   Sequence 28, Appl
   16    34.6    8.0     11220   4  US-09-105-537-32   Sequence 32, Appl
   17    34.6    8.0     36778   4  US-09-105-537-5    Sequence 5, Appli
   18    34.6    8.0     38506   3  US-09-320-878-19   Sequence 19, Appl
   19    33.6    7.8       996   4  US-09-025-819-1    Sequence 1, Appli
   20    33.6    7.8      1515   4  US-09-292-768-5    Sequence 5, Appli
   21    33.6    7.8      1665   3  US-08-881-784-8    Sequence 8, Appli
   22    33.6    7.8      1665   4  US-09-292-768-3    Sequence 3, Appli
   23    33.6    7.8      1665   4  US-09-292-768-67   Sequence 67, Appl
   24    33.6    7.8      1665   4  US-09-292-768-69   Sequence 69, Appl
   25      33    7.6      1893   1  US-08-532-065B-1   Sequence 1, Appli
   26      33    7.6   4403765   4  US-09-103-840A-2   Sequence 2, Appl
   27      33    7.6   4411529   4  US-09-103-840A-1   Sequence 1, Appl
   28    32.6    7.5     43280   2  US-08-804-227C-1   Sequence 1, Appli
   29    32.2    7.5      1656   4  US-09-385-028-14   Sequence 14, Appl
   30    32.2    7.5     15079   4  US-09-385-028-1    Sequence 1, Appli
   31      32    7.4       801   2  US-08-975-316-50   Sequence 50, Appl
   32    31.8    7.4      1518   1  US-08-148-215A-3   Sequence 3, Appli
   33    31.4    7.3   4403765   4  US-09-103-840A-2   Sequence 2, Appl
   34    31.4    7.3   4411529   4  US-09-103-840A-1   Sequence 1, Appl
   35    31.2    7.2       461   2  US-08-825-556A-1   Sequence 1, Appli
   36    31.2    7.2      1269   1  US-08-396-218-1    Sequence 1, Appli
   37    31.2    7.2      1269   1  US-08-760-116-1    Sequence 1, Appli
   38    30.4    7.0      1162   2  US-08-726-306A-52  Sequence 52, Appl
   39    30.4    7.0      1575   2  US-08-811-897A-34  Sequence 34, Appl
   40    30.4    7.0      1575   2  US-08-855-213-34   Sequence 34, Appl
   41    30.4    7.0      1656   2  US-08-811-897A-36  Sequence 36, Appl
   42    30.4    7.0      1656   2  US-08-855-213-36   Sequence 36, Appl
   43    30.4    7.0      1659   2  US-08-811-897A-35  Sequence 35, Appl
   44    30.4    7.0      1659   2  US-08-811-897A-37  Sequence 37, Appl
   45    30.4    7.0      1659   2  US-08-855-213-35   Sequence 35, Appl



ALIGNMENTS 



RESULT 1 
US-08-975-316-17 

; Sequence 17, Application US/08975316 
; Patent No. 5952486 
; GENERAL INFORMATION: 

APPLICANT: BLOKSBERG, Leonard N., HAVUKKALA, Ilkka 

APPLICANT: and GRIERSON, Alastair W. 

TITLE OF INVENTION: MATERIALS AND METHODS FOR 

TITLE OF INVENTION: THE MODIFICATION OF PLANT LIGNIN CONTENT 
NUMBER OF SEQUENCES: 88 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Law Offices of Ann W. Speckman 

STREET: 2601 Elliott Avenue, Suite 4185 

CITY: Seattle 

STATE: WA 

COUNTRY: USA 

ZIP: 98121 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 

COMPUTER: IBM Compatible 

OPERATING SYSTEM: DOS 

SOFTWARE: FastSEQ for Windows Version 2.0 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/975,316

FILING DATE: 

CLASSIFICATION: 800 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/713,000 

FILING DATE: September 11, 1996 
ATTORNEY/AGENT INFORMATION: 

NAME: SLEATH, Janet 

REGISTRATION NUMBER: 37,007 

REFERENCE/DOCKET NUMBER: 11000/1003C1 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: 206-269-0565

TELEFAX: 206-269-0563 

TELEX: 

; INFORMATION FOR SEQ ID NO: 17: 



SEQUENCE CHARACTERISTICS: 
LENGTH: 622 base pairs 
TYPE: nucleic acid 
STRANDEDNESS : single 
TOPOLOGY: linear 
US-08-975-316-17 



Query Match 13.8%; Score 59.6; DB 2; Length 622; 

Best Local Similarity 54.1%; Pred. No. 1e-08;

Matches 119; Conservative 0; Mismatches 101; Indels 0; Gaps 0; 

Qy     61 cctgctgatccctttcgggatggggcggcccaattgccccggggaaacgctcgcgctgcg 120
          ||  ||  | || || ||||||||| ||   | ||| || || |     || ||     |
Db    180 CCGACTATTGCCGTTTGGGATGGGGAGGAGAAGTTGTCCTGGTGCTGGCCTTGCCAATAG 239

Qy    121 gaccgtcgggctggtgctcgcaacgctgctcaattgcttcgactgggacacggttgatgg 180
              ||  |  |||| || ||  ||||  |  | |||||||| |||||    ||||  |
Db    240 AGTGGTGAGCTTGGTCCTGGCGGCGCTTATTCAGTGCTTCGAATGGGAACGAGTTGGCGA 299

Qy    181 agctcaggtttgacatgaagctancggcgggctgaccatgccccgggccgtcccgttgga 240
          ||    |||          |        || || || |||||   |   |  || |||||
Db    300 AGAATTGGTGGACTTGTCCGAGGGGACGGGACTCACAATGCCAAAGAGAGAGCCATTGGA 359

Qy    241 ggccatgtgcangccgcgtacagctatgcgtggtgttctt 280
          |||| ||||||   |||||  |   |||   | |  | ||
Db    360 GGCCTTGTGCAAAGCGCGTGAATGCATGATAGCTAATGTT 399
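
The percentages reported for each result can be cross-checked against the counts printed with it. The sketch below assumes that Query Match is the score divided by the perfect score (432 for this query) and that Best Local Similarity is matches divided by aligned positions. Both readings agree with RESULT 1, but the report does not spell them out. The function names are illustrative, and treating the ambiguity code n as a mismatch is likewise an assumption.

```python
def query_match_pct(score, perfect_score):
    """Score as a percentage of the perfect (self-match) score."""
    return 100.0 * score / perfect_score

def local_similarity_pct(matches, mismatches, indels=0):
    """Identical positions as a percentage of all aligned positions."""
    return 100.0 * matches / (matches + mismatches + indels)

def match_line(query, subject):
    """Mark identical positions with '|'; 'n' never counts as a match (assumption)."""
    return "".join(
        "|" if q == s and q != "n" else " "
        for q, s in zip(query.lower(), subject.lower())
    )

# RESULT 1: Score 59.6, Matches 119, Mismatches 101.
print(round(query_match_pct(59.6, 432), 1))
print(round(local_similarity_pct(119, 101), 1))

# Last row of the RESULT 1 alignment (Qy 241-280 against Db 360-399).
qy = "ggccatgtgcangccgcgtacagctatgcgtggtgttctt"
db = "GGCCTTGTGCAAAGCGCGTGAATGCATGATAGCTAATGTT"
print(match_line(qy, db).count("|"))
```

Summing the per-row match counts over a whole alignment gives a quick way to confirm that a transcribed alignment still agrees with its reported Matches value.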



RESULT 2 
US-09-158-767-7 

; Sequence 7, Application US/09158767A 

; Patent No. 6180363 

; GENERAL INFORMATION: 

; APPLICANT: Batard, Yannick 

; APPLICANT: Durst, Francis 

; APPLICANT: Schalk, Michel 

; APPLICANT: Werck-Reichhart, Daniele
; TITLE OF INVENTION: RECODING OF DNA SEQUENCES PERMITTING 
; TITLE OF INVENTION: EXPRESSION IN YEAST AND OBTAINED TRANSFORMED YEAST 
; FILE REFERENCE: A32000 

; CURRENT APPLICATION NUMBER: US/09/158,767A

; CURRENT FILING DATE: 1998-09-23 

; EARLIER APPLICATION NUMBER: FR 97-12094 

; EARLIER FILING DATE: 1997-09-24 

; NUMBER OF SEQ ID NOS : 20 

; SOFTWARE: FastSEQ for Windows Version 3.0 
; SEQ ID NO 7 

LENGTH: 1506 

TYPE: DNA 

ORGANISM: Artificial Sequence 
FEATURE: 

OTHER INFORMATION: Altered sequences 
US-09-158-767-7 



Query Match 9.7%; Score 42; DB 4; Length 1506;
Best Local Similarity 53.7%; Pred. No. 0.0027;
Matches 87; Conservative 0; Mismatches 75; Indels 0; Gaps 0;

Qy      1 gtccagcagctcggacttacccggccggttcgatggctccggcggcaaggccaaggggcc 60
          || |||      |   || | ||    |    | | |  ||  | | | | ||| |
Db   1233 gttcaggccggagaggttcctcgaggaggagaaggccgtcgaggcccacggcaacgattt 1292

Qy     61 cctgctgatccctttcgggatggggcggcccaattgccccggggaaacgctcgcgctgcg 120
          || | |  | || |||||  | || || |  |  |||||||||   |  ||||||||||
Db   1293 ccggttcgtgcccttcggcgtcggccgccggagctgccccgggatcatcctcgcgctgcc 1352

Qy    121 gaccgtcgggctggtgctcgcaacgctgctcaattgcttcga 162
           | | ||||  |   ||||| |   ||| |  |   |||| |
Db   1353 catcatcggcatcacgctcggacgcctggtgcagaacttcca 1394





RESULT 3 
US-09-158-767-8 

; Sequence 8, Application US/09158767A 

; Patent No. 6180363 

; GENERAL INFORMATION: 

; APPLICANT: Batard, Yannick 

; APPLICANT: Durst, Francis 

; APPLICANT: Schalk, Michel 

; APPLICANT: Werck-Reichhart, Daniele
; TITLE OF INVENTION: RECODING OF DNA SEQUENCES PERMITTING 

; TITLE OF INVENTION: EXPRESSION IN YEAST AND OBTAINED TRANSFORMED YEAST 
; FILE REFERENCE: A32000 

; CURRENT APPLICATION NUMBER: US/09/158,767A

; CURRENT FILING DATE: 1998-09-23 

; EARLIER APPLICATION NUMBER: FR 97-12094 

; EARLIER FILING DATE: 1997-09-24 

; NUMBER OF SEQ ID NOS : 20 

; SOFTWARE: FastSEQ for Windows Version 3.0 
; SEQ ID NO 8

LENGTH: 1506 

TYPE: DNA 

ORGANISM: Artificial Sequence 
FEATURE: 

OTHER INFORMATION: Altered sequences 
US-09-158-767-8 



Query Match 9.7%; Score 42; DB 4; Length 1506; 

Best Local Similarity 53.7%; Pred. No. 0.0027; 

Matches 87; Conservative 0; Mismatches 75; Indels 0; Gaps 0; 

Qy      1 gtccagcagctcggacttacccggccggttcgatggctccggcggcaaggccaaggggcc 60
          || |||      |   || | ||    |    | | |  ||  | | | | ||| |
Db   1233 gttcaggccggagaggttcctcgaggaggagaaggccgtcgaggcccacggcaacgattt 1292

Qy     61 cctgctgatccctttcgggatggggcggcccaattgccccggggaaacgctcgcgctgcg 120
          || | |  | || |||||  | || || |  |  |||||||||   |  ||||||||||
Db   1293 ccggttcgtgcccttcggcgtcggccgccggagctgccccgggatcatcctcgcgctgcc 1352

Qy    121 gaccgtcgggctggtgctcgcaacgctgctcaattgcttcga 162
           | | ||||  |   ||||| |   ||| |  |   |||| |
Db   1353 catcatcggcatcacgctcggacgcctggtgcagaacttcca 1394



RESULT 4 
US-09-158-767-9 

; Sequence 9, Application US/09158767A 

; Patent No. 6180363 

; GENERAL INFORMATION: 

; APPLICANT: Batard, Yannick 

; APPLICANT: Durst, Francis 

; APPLICANT: Schalk, Michel 

; APPLICANT: Werck-Reichhart, Daniele
; TITLE OF INVENTION: RECODING OF DNA SEQUENCES PERMITTING 
; TITLE OF INVENTION: EXPRESSION IN YEAST AND OBTAINED TRANSFORMED YEAST 
; FILE REFERENCE: A32000 

; CURRENT APPLICATION NUMBER: US/09/158,767A

; CURRENT FILING DATE: 1998-09-23 

; EARLIER APPLICATION NUMBER: FR 97-12094 

; EARLIER FILING DATE: 1997-09-24 

; NUMBER OF SEQ ID NOS : 20 

; SOFTWARE: FastSEQ for Windows Version 3.0 
; SEQ ID NO 9 

LENGTH: 1506 

TYPE: DNA 

ORGANISM: Artificial Sequence 
FEATURE: 

; OTHER INFORMATION: Altered sequences 
US-09-158-767-9 



Query Match 9.7%; Score 42; DB 4; Length 1506; 

Best Local Similarity 53.7%; Pred. No. 0.0027; 

Matches 87; Conservative 0; Mismatches 75; Indels 0; Gaps 0;

Qy      1 gtccagcagctcggacttacccggccggttcgatggctccggcggcaaggccaaggggcc 60
          || |||      |   || | ||    |    | | |  ||  | | | | ||| |
Db   1233 gttcaggccggagaggttcctcgaggaggagaaggccgtcgaggcccacggcaacgattt 1292

Qy     61 cctgctgatccctttcgggatggggcggcccaattgccccggggaaacgctcgcgctgcg 120
          || | |  | || |||||  | || || |  |  |||||||||   |  ||||||||||
Db   1293 ccggttcgtgcccttcggcgtcggccgccggagctgccccgggatcatcctcgcgctgcc 1352

Qy    121 gaccgtcgggctggtgctcgcaacgctgctcaattgcttcga 162
           | | ||||  |   ||||| |   ||| |  |   |||| |
Db   1353 catcatcggcatcacgctcggacgcctggtgcagaacttcca 1394




RESULT 5 
US-09-158-767-1 

; Sequence 1, Application US/09158767A 

; Patent No. 6180363 

; GENERAL INFORMATION: 

; APPLICANT: Batard, Yannick 

; APPLICANT: Durst, Francis
; APPLICANT: Schalk, Michel
; APPLICANT: Werck-Reichhart, Daniele
; TITLE OF INVENTION: RECODING OF DNA SEQUENCES PERMITTING
; TITLE OF INVENTION: EXPRESSION IN YEAST AND OBTAINED TRANSFORMED YEAST
; FILE REFERENCE: A32000
; CURRENT APPLICATION NUMBER: US/09/158,767A
; CURRENT FILING DATE: 1998-09-23
; EARLIER APPLICATION NUMBER: FR 97-12094
; EARLIER FILING DATE: 1997-09-24
; NUMBER OF SEQ ID NOS: 20
; SOFTWARE: FastSEQ for Windows Version 3.0
; SEQ ID NO 1
; LENGTH: 2261
; TYPE: DNA
; ORGANISM: Triticum aestivum
US-09-158-767-1 



Query Match 9.7%; Score 42; DB 4; Length 2261; 

Best Local Similarity 53.7%; Pred. No. 0.0032; 

Matches 87; Conservative 0; Mismatches 75; Indels 0; Gaps 0; 

Qy      1 gtccagcagctcggacttacccggccggttcgatggctccggcggcaaggccaaggggcc 60
          || |||      |   || | ||    |    | | |  ||  | | | | ||| |
Db   1281 gttcaggccggagaggttcctcgaggaggagaaggccgtcgaggcccacggcaacgattt 1340

Qy     61 cctgctgatccctttcgggatggggcggcccaattgccccggggaaacgctcgcgctgcg 120
          || | |  | || |||||  | || || |  |  |||||||||   |  ||||||||||
Db   1341 ccggttcgtgcccttcggcgtcggccgccggagctgccccgggatcatcctcgcgctgcc 1400

Qy    121 gaccgtcgggctggtgctcgcaacgctgctcaattgcttcga 162
           | | ||||  |   ||||| |   ||| |  |   |||| |
Db   1401 catcatcggcatcacgctcggacgcctggtgcagaacttcca 1442

RESULT 6 
US-08-881-784-5 

Sequence 5, Application US/08881784 
Patent No. 6083731 
GENERAL INFORMATION: 

APPLICANT: Croteau, Rodney B. 
APPLICANT: Lupien, Shari L. 
APPLICANT: Karp, Frank 

TITLE OF INVENTION: RECOMBINANT MATERIALS AND METHODS FOR 
TITLE OF INVENTION: THE PRODUCTION OF LIMONENE HYDROXYLASES 
NUMBER OF SEQUENCES: 58 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Christensen, O'Connor, Johnson and Kindness 
ADDRESSEE: PLLC 

STREET: 1420 Fifth Avenue, Suite 2800 
CITY: Seattle 
STATE : WA 
COUNTRY: USA 
ZIP : 98101 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS
SOFTWARE: PatentIn Release #1.0, Version #1.30
CURRENT APPLICATION DATA: 



APPLICATION NUMBER: US/08/881,784 

FILING DATE: 

CLASSIFICATION: 435 
ATTORNEY/AGENT INFORMATION: 

NAME: Shelton, Dennis K. 

REGISTRATION NUMBER: 26,997 

REFERENCE/DOCKET NUMBER: WSUR19777 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (206) 224-0718 

TELEFAX: (206) 224-0779 
INFORMATION FOR SEQ ID NO: 5: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 1762 base pairs
; TYPE: nucleic acid 

STRANDEDNESS: single 

TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
ORIGINAL SOURCE: 

ORGANISM: Mentha spicata 

INDIVIDUAL ISOLATE: cDNA encoding 

INDIVIDUAL ISOLATE: 
IMMEDIATE SOURCE: 

CLONE: pSM12.2 
FEATURE:
NAME/KEY: misc_feature
LOCATION: 558..1212
OTHER INFORMATION: /product= "Probe LH-1 (Figure 4A)"
FEATURE:
NAME/KEY: misc_feature
LOCATION: 39..538
OTHER INFORMATION: /product= "Probe LH-2 (Figure 4A)"
US-08-881-784-5 



Query Match 8.9%; Score 38.4; DB 3; Length 1762; 

Best Local Similarity 55.1%; Pred. No. 0.034; 

Matches 75; Conservative 0; Mismatches 61; Indels 0; Gaps 0; 

Qy     46 caaggccaaggggcccctgctgatccctttcgggatggggcggcccaattgccccgggga 105
          || ||  || |    |  | | ||||| ||||||  ||| ||    |  ||||||||
Db   1276 CATGGGAAACGATTTCGAGTTCATCCCATTCGGGGCGGGTCGAAGAATCTGCCCCGGTTT 1335

Qy    106 aacgctcgcgctgcggaccgtcgggctggtgctcgcaacgctgctcaattgcttcgactg 165
          |    ||| ||||   |  || | | |     | ||    |||||| |   |||||||||
Db   1336 ACATTTCGGGCTGGCAAATGTTGAGATCCCATTGGCGCAACTGCTCTACCACTTCGACTG 1395

Qy    166 ggacacggttgatgga 181
          | |   |    | |||
Db   1396 GAAATTGCCACAAGGA 1411



RESULT 7 
US-09-292-768-1 

; Sequence 1, Application US/09292768
; Patent No. 6194185 
; GENERAL INFORMATION: 

APPLICANT: Croteau, Rodney B 



; APPLICANT: Lupien, Shari L 
; APPLICANT: Karp, Frank 

; TITLE OF INVENTION: RECOMBINANT MATERIALS AND METHODS FOR THE PRODUCTION OF 
; TITLE OF INVENTION: LIMONENE HYDROXYLASES 
; FILE REFERENCE: wsurl3463 

; CURRENT APPLICATION NUMBER: US/09/292,768

; CURRENT FILING DATE: 1999-04-14 

; EARLIER APPLICATION NUMBER: 08/881,784 

; EARLIER FILING DATE: 1997-06-24 

; NUMBER OF SEQ ID NOS : 70 

; SOFTWARE: PatentIn Ver. 2.0

; SEQ ID NO 1 

LENGTH: 1762
TYPE: DNA
ORGANISM: Mentha spicata
FEATURE:
NAME/KEY: CDS
LOCATION: (20)..(1507)
US-09-292-768-1 



Query Match 8.9%; Score 38.4; DB 4; Length 1762; 

Best Local Similarity 55.1%; Pred. No. 0.034; 

Matches 75; Conservative 0; Mismatches 61; Indels 0; Gaps 0;

Qy     46 caaggccaaggggcccctgctgatccctttcgggatggggcggcccaattgccccgggga 105
          || ||  || |    |  | | ||||| ||||||  ||| ||    |  ||||||||
Db   1276 catgggaaacgatttcgagttcatcccattcggggcgggtcgaagaatctgccccggttt 1335

Qy    106 aacgctcgcgctgcggaccgtcgggctggtgctcgcaacgctgctcaattgcttcgactg 165
          |    ||| ||||   |  || | | |     | ||    |||||| |   |||||||||
Db   1336 acatttcgggctggcaaatgttgagatcccattggcgcaactgctctaccacttcgactg 1395

Qy    166 ggacacggttgatgga 181
          | |   |    | |||
Db   1396 gaaattgccacaagga 1411



RESULT 8 
US-09-292-768-63 

; Sequence 63, Application US/09292768 

; Patent No. 6194185 

; GENERAL INFORMATION: 

; APPLICANT: Croteau, Rodney B

APPLICANT: Lupien, Shari L 
; APPLICANT: Karp, Frank 

; TITLE OF INVENTION: RECOMBINANT MATERIALS AND METHODS FOR THE PRODUCTION OF 
; TITLE OF INVENTION: LIMONENE HYDROXYLASES 
; FILE REFERENCE: wsurl3463 

; CURRENT APPLICATION NUMBER: US/09/292,768
; CURRENT FILING DATE: 1999-04-14 
; EARLIER APPLICATION NUMBER: 08/881,784 
; EARLIER FILING DATE: 1997-06-24 
; NUMBER OF SEQ ID NOS: 70 
; SOFTWARE: PatentIn Ver. 2.0
; SEQ ID NO 63 
LENGTH: 1762 



TYPE: DNA 

ORGANISM: Artificial Sequence 
FEATURE : 

OTHER INFORMATION: Description of Artificial Sequence: 
; OTHER INFORMATION: computer-generated nucleic acid sequence

FEATURE: 

NAME/KEY: CDS 

LOCATION: (20)..(1507)
; OTHER INFORMATION: Computer-generated nucleic acid sequence encoding 
; OTHER INFORMATION: limonene-6-hydroxylase variant 
US-09-292-768-63 



Query Match 8.9%; Score 38.4; DB 4; Length 1762; 

Best Local Similarity 55.1%; Pred. No. 0.034; 



Matches 75; Conservative 0; Mismatches 61; Indels 0; Gaps 0;

Qy     46 caaggccaaggggcccctgctgatccctttcgggatggggcggcccaattgccccgggga 105
          || ||  || |    |  | | ||||| ||||||  ||| ||    |  ||||||||
Db   1276 catgggaaacgatttcgagttcatcccattcggggcgggtcgaagaatctgccccggttt 1335

Qy    106 aacgctcgcgctgcggaccgtcgggctggtgctcgcaacgctgctcaattgcttcgactg 165
          |    ||| ||||   |  || | | |     | ||    |||||| |   |||||||||
Db   1336 acatttcgggctggcaaatgttgagatcccattggcgcaactgctctaccacttcgactg 1395

Qy    166 ggacacggttgatgga 181
          | |   |    | |||
Db   1396 gaaattgccacaagga 1411





RESULT 9 
US-09-292-768-65 

; Sequence 65, Application US/09292768 

; Patent No. 6194185 

; GENERAL INFORMATION: 

; APPLICANT: Croteau, Rodney B 

APPLICANT: Lupien, Shari L 
; APPLICANT: Karp, Frank 

; TITLE OF INVENTION: RECOMBINANT MATERIALS AND METHODS FOR THE PRODUCTION 
; TITLE OF INVENTION: LIMONENE HYDROXYLASES 
; FILE REFERENCE: wsurl3463 

; CURRENT APPLICATION NUMBER: US/09/292,768

; CURRENT FILING DATE: 1999-04-14 

; EARLIER APPLICATION NUMBER: 08/881,784 

; EARLIER FILING DATE: 1997-06-24 

; NUMBER OF SEQ ID NOS : 70 

; SOFTWARE: PatentIn Ver. 2.0

; SEQ ID NO 65 

LENGTH: 1762

TYPE: DNA 
; ORGANISM: Artificial Sequence 

FEATURE: 

; OTHER INFORMATION: Description of Artificial Sequence: 
; OTHER INFORMATION: computer-generated nucleic acid sequence encoding 
; OTHER INFORMATION: limonene-6-hydroxylase
FEATURE:
NAME/KEY: misc_feature
LOCATION: (1)..(1762)
; OTHER INFORMATION: computer-generated nucleic acid sequence encoding
; OTHER INFORMATION: spearmint limonene-6-hydroxylase variant
FEATURE:
NAME/KEY: CDS
LOCATION: (20)..(1507)
US-09-292-768-65 



Query Match 8.9%; Score 38.4; DB 4; Length 1762; 

Best Local Similarity 55.1%; Pred. No. 0.034; 

Matches 75; Conservative 0; Mismatches 61; Indels 0; Gaps 0; 

Qy     46 caaggccaaggggcccctgctgatccctttcgggatggggcggcccaattgccccgggga 105
          || ||  || |    |  | | ||||| ||||||  ||| ||    |  ||||||||
Db   1276 catgggaaacgatttcgagttcatcccattcggggcgggtcgaagaatctgccccggttt 1335

Qy    106 aacgctcgcgctgcggaccgtcgggctggtgctcgcaacgctgctcaattgcttcgactg 165
          |    ||| ||||   |  || | | |     | ||    |||||| |   |||||||||
Db   1336 acatttcgggctggcaaatgttgagatcccattggcgcaactgctctaccacttcgactg 1395

Qy    166 ggacacggttgatgga 181
          | |   |    | |||
Db   1396 gaaattgccacaagga 1411



RESULT 10 
US-09-172-339-5 

; Sequence 5, Application US/09172339 
; Patent No. 6291745 
; GENERAL INFORMATION: 

APPLICANT: EuClaire Meyer, Terry 
; APPLICANT: Yalpani, Nasser 

; TITLE OF INVENTION: Limonene and Other Downstream
; TITLE OF INVENTION: Metabolites of Geranyl Pyrophosphate for Insect Control in
; TITLE OF INVENTION: Plants
; FILE REFERENCE: 5718-65 

; CURRENT APPLICATION NUMBER: US/09/172,339 

; CURRENT FILING DATE: 1998-10-14 

; EARLIER APPLICATION NUMBER: 08/449,061 

; EARLIER FILING DATE: 1995-05-24 

; EARLIER APPLICATION NUMBER: 08/153,544 

; EARLIER FILING DATE: 1993-11-16 

; EARLIER APPLICATION NUMBER: 08/042,199 

; EARLIER FILING DATE: 1993-04-02 

; NUMBER OF SEQ ID NOS : 8 

SOFTWARE: FastSEQ for Windows Version 3.0 
; SEQ ID NO 5 

LENGTH: 1762

TYPE: DNA 

ORGANISM: Mentha spicata 
FEATURE: 

NAME/KEY: misc_feature 
LOCATION: (0) ... (0) 

OTHER INFORMATION: Carveol Synthase 
FEATURE: 



NAME/KEY: CDS 
LOCATION: (20)...(1507)
US-09-172-339-5 



Query Match 8.9%; Score 38.4; DB 4; Length 1762;
Best Local Similarity 55.1%; Pred. No. 0.034;
Matches 75; Conservative 0; Mismatches 61; Indels 0; Gaps 0;

Qy     46 caaggccaaggggcccctgctgatccctttcgggatggggcggcccaattgccccgggga 105
          || ||  || |    |  | | ||||| ||||||  ||| ||    |  ||||||||
Db   1276 catgggaaacgatttcgagttcatcccattcggggcgggtcgaagaatctgccccggttt 1335

Qy    106 aacgctcgcgctgcggaccgtcgggctggtgctcgcaacgctgctcaattgcttcgactg 165
          |    ||| ||||   |  || | | |     | ||    |||||| |   |||||||||
Db   1336 acatttcgggctggcaaatgttgagatcccattggcgcaactgctctaccacttcgactg 1395

Qy    166 ggacacggttgatgga 181
          | |   |    | |||
Db   1396 gaaattgccacaagga 1411





RESULT 11 
US-09-380-420C-1 

; Sequence 1, Application US/09380420C 
; Patent No. 6300544 

GENERAL INFORMATION: 

APPLICANT: Halkier, Barbara 
; Bak, Soren 

; Kahn, Rachel 

; Moller, Birger 

; TITLE OF INVENTION: Cytochrome P450 Monooxygenases 

NUMBER OF SEQUENCES: 23 
CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Syngenta Patent Dept. 

STREET: 3054 Cornwallis Road 

CITY: RTP 

STATE: NC 

COUNTRY: USA 

ZIP: 27709 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/09/380,420C

FILING DATE: 12-Nov-1999

CLASSIFICATION: <Unknown> 
ATTORNEY/AGENT INFORMATION: 

NAME: Meigs, J. Timothy 

REGISTRATION NUMBER: 38,241 

REFERENCE/DOCKET NUMBER: S-21251A 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: 919-541-8587 
INFORMATION FOR SEQ ID NO: 1: 
SEQUENCE CHARACTERISTICS: 



LENGTH: 1929 base pairs 

TYPE: nucleic acid 

STRANDEDNESS : double 

TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
IMMEDIATE SOURCE: 

CLONE: P450ox 
FEATURE:
NAME/KEY: CDS
LOCATION: 81..1673
SEQUENCE DESCRIPTION: SEQ ID NO: 1: 
US-09-380-420C-1 



Query Match 8.6%; Score 37; DB 4; Length 1929; 

Best Local Similarity 54.9%; Pred. No. 0.092; 

Matches 73; Conservative 0; Mismatches 60; Indels 0; Gaps 0;

Qy     35 ggctccggcggcaaggccaaggggcccctgctgatccctttcgggatggggcggcccaat 94
          | |  || |  | | | |  |     |  ||| || || ||||||   || || |  |
Db   1422 GACGTCGACTACTACGGCTCGCACTTCGAGCTCATACCGTTCGGGGCCGGCCGCCGGATC 1481

Qy     95 tgccccggggaaacgctcgcgctgcggaccgtcgggctggtgctcgcaacgctgctcaat 154
          ||||| ||    ||  | |    |   | ||||    |    ||||| |  |||||| |
Db   1482 TGCCCGGGACTCACCATGGGCGAGACCAACGTCACCTTCACCCTCGCCAACCTGCTCTAC 1541

Qy    155 tgcttcgactggg 167
          |||| ||||||||
Db   1542 TGCTACGACTGGG 1554



RESULT 12 
US-07-945-283-1/C 

; Sequence 1, Application US/07945283 

; Patent No. 5352596 

; GENERAL INFORMATION: 

APPLICANT: Cheung, Andrew K. 

APPLICANT: Wesley, Ronald D. 

TITLE OF INVENTION: Pseudorabies Virus Deletion Mutants 
TITLE OF INVENTION: Involving The EP0 and LLT Genes 
NUMBER OF SEQUENCES: 7 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Curtis P. Ribando 

STREET: 1815 North University Street
; CITY: Peoria 

STATE: IL 

COUNTRY: USA 

ZIP : 61604 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/07/945,283

FILING DATE: 19920911 

CLASSIFICATION: 424 



ATTORNEY /AGENT INFORMATION: 
NAME: Ribando, Curtis P 
REGISTRATION NUMBER: 27976 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: 309-685-4011 ext. 513 
TELEFAX: 309-685-4128
INFORMATION FOR SEQ ID NO: 1: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 8438 base pairs 
TYPE: NUCLEIC ACID 
STRANDEDNESS : double 
TOPOLOGY: linear 
MOLECULE TYPE: DNA (genomic) 
HYPOTHETICAL: NO 
ANTI-SENSE : NO 
ORIGINAL SOURCE: 

ORGANISM: Pseudorabies virus 
FEATURE:
NAME/KEY: CDS
LOCATION: 622..6495
FEATURE:
NAME/KEY: variation
LOCATION: replace(1099, "g")
FEATURE:
NAME/KEY: variation
LOCATION: replace(1267, "t")
FEATURE:
NAME/KEY: variation
LOCATION: replace(1381, "c")
FEATURE:
NAME/KEY: variation
LOCATION: replace(1566, "c")
FEATURE:
NAME/KEY: variation
LOCATION: replace(7010, "g")
US-07-945-283-1 



Query Match 8.3%; Score 36; DB 1; Length 8438; 

Best Local Similarity 50.0%; Pred. No. 0.34; 

Matches 90; Conservative 0; Mismatches 90; Indels 0; Gaps 0; 

Qy     22 cggccggttcgatggctccggcggcaaggccaaggggcccctgctgatccctttcgggat 81
          || | |  ||   | || | |||||     | || | |  ||||    |||  || || |
Db   4708 CGCCTGCGTCCTGGCCTGCCGCGGCGTCCTCGAGCGCCTGCTGCCCTGCCCGCTCCGGCT 4649

Qy     82 ggggcggcccaattgccccggggaaacgctcgcgctgcggaccgtcgggctggtgctcgc 141
          |     ||||    || ||  ||   | |||| ||       | ||| |  ||||  |||
Db   4648 GCCCGCGCCCGCCCGCGCCCCGGCCGCCCTCGGGCCCGCCTGCCTCGAGGAGGTGACCGC 4589

Qy    142 aacgctgctcaattgcttcgactgggacacggttgatggagctcaggtttgacatgaagc 201
            ||||||||     |  ||||  |  | | |  |  ||  |    |   | ||  | ||
Db   4588 CGCGCTGCTCGCGCTCCGCGACGCGATCCCCGGGGCCGGCCCGGCCGAGCGGCAGCAGGC 4529



RESULT 13 
US-07-721-775A-1/C 



Sequence 1, Application US/07721775A
Patent No. 5180666
GENERAL INFORMATION:
APPLICANT: States, J. Christopher
APPLICANT: Hines, Ronald N.
APPLICANT: Novak, Raymond F.
TITLE OF INVENTION: METHOD AND CELL LINE FOR TESTING 
TITLE OF INVENTION: MUTAGENICITY OF A CHEMICAL 
NUMBER OF SEQUENCES: 2 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Reising, Ethington, Barnard, Perry & Milton 
STREET: P.O. Box 4390 
CITY: Troy 
STATE: Michigan 
COUNTRY: U.S.A. 
ZIP: 48099 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 
SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/07/721,775A
FILING DATE: 19910627
CLASSIFICATION: 435
ATTORNEY/AGENT INFORMATION: 
NAME: Kohn, Kenneth I. 
REGISTRATION NUMBER: 30,955 
REFERENCE/DOCKET NUMBER: P-321WSU 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (313) 689-3554 
TELEFAX: (313) 689-4071 
INFORMATION FOR SEQ ID NO: 1: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 6387 base pairs 
TYPE: NUCLEIC ACID 
STRANDEDNESS : double 
TOPOLOGY: circular 
MOLECULE TYPE: DNA (genomic) 
ORIGINAL SOURCE: 

ORGANISM: Homo sapiens 
FEATURE:
NAME/KEY: exon
LOCATION: complement (2807..3631)
FEATURE:
NAME/KEY: exon
LOCATION: complement (2125..2251)
FEATURE:
NAME/KEY: exon
LOCATION: complement (1948..2037)
FEATURE:
NAME/KEY: exon
LOCATION: complement (1733..1856)
FEATURE:
NAME/KEY: exon
LOCATION: complement (1501..1587)
FEATURE:
NAME/KEY: exon
LOCATION: complement (237..1308)
FEATURE:
NAME/KEY: promoter
LOCATION: complement (3638..3967)
FEATURE:
NAME/KEY: CDS
LOCATION: 4586..5446
US-07-721-775A-1



Query Match 8.2%; Score 35.4; DB 1; Length 6387; 

Best Local Similarity 48.5%; Pred. No. 0.46; 

Matches 96; Conservative 0; Mismatches 102; Indels 0; Gaps 0; 

Qy     32 gatggctccggcggcaaggccaaggggcccctgctgatccctttcgggatggggcggccc 91
          |||||  |   || |||||      |      | ||||    || || |||||   ||
Db   1256 GATGGTGCTATCGACAAGGTGTTAAGTGAGAAGGTGATTATCTTTGGCATGGGCAAGCGG 1197

Qy     92 aattgccccggggaaacgctcgcgctgcggaccgtcgggctggtgctcgcaacgctgctc 151
          || ||   ||| || ||  | || |   ||   |||   ||  | || || |  |||||
Db   1196 AAGTGTATCGGTGAGACCATTGCCCGCTGGGAGGTCTTTCTCTTCCTGGCTATCCTGCTG 1137

Qy    152 aattgcttcgactgggacacggttgatggagctcaggtttgacatgaagctancggcggg 211
           |  |  | || |    |  |      || |   ||||           | | |   |||
Db   1136 CAACGGGTGGAATTCAGCGTGCCACTGGGCGTGAAGGTGGACATGACCCCCATCTATGGG 1077

Qy    212 ctgaccatgccccgggcc 229
          || ||||||   |  |||
Db   1076 CTAACCATGAAGCATGCC 1059



RESULT 14 
US-08-339-658-1/C 

; Sequence 1, Application US/08339658 
; Patent No. 5525482 
; GENERAL INFORMATION: 

APPLICANT: States, J. Christopher
APPLICANT: Hines, Ronald N.
APPLICANT: Novak, Raymond F.

TITLE OF INVENTION: METHOD AND CELL LINE FOR TESTING 
TITLE OF INVENTION: MUTAGENICITY OF A CHEMICAL 
NUMBER OF SEQUENCES: 2 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Reising, Ethington, Barnard, Perry & Milton 

STREET: P.O. Box 4390 

CITY: Troy 

STATE: Michigan 

COUNTRY: U.S.A. 

ZIP: 48099 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/339,658 



FILING DATE: 15-NOV-1994 

CLASSIFICATION: 435 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 07/990,295 

FILING DATE: 09-DEC-1992 
ATTORNEY/AGENT INFORMATION: 

NAME: Kohn, Kenneth I. 

REGISTRATION NUMBER: 30,955 

REFERENCE/DOCKET NUMBER: P-321WSU 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (313) 689-3554 

TELEFAX: (313) 689-4071 
INFORMATION FOR SEQ ID NO: 1: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 6387 base pairs 

TYPE: nucleic acid 

STRANDEDNESS : double 

TOPOLOGY : circular 
MOLECULE TYPE: DNA (genomic) 
ORIGINAL SOURCE: 

ORGANISM: Homo sapiens 



FEATURE:
NAME/KEY: exon
LOCATION: complement (2807..3631)
FEATURE:
NAME/KEY: exon
LOCATION: complement (2125..2251)
FEATURE:
NAME/KEY: exon
LOCATION: complement (1948..2037)
FEATURE:
NAME/KEY: exon
LOCATION: complement (1733..1856)
FEATURE:
NAME/KEY: exon
LOCATION: complement (1501..1587)
FEATURE:
NAME/KEY: exon
LOCATION: complement (237..1308)
FEATURE:
NAME/KEY: promoter
LOCATION: complement (3638..3967)
FEATURE:
NAME/KEY: CDS
LOCATION: 4586..5446
US-08-339-658-1



Query Match 8.2%; Score 35.4; DB 1; Length 6387; 

Best Local Similarity 48.5%; Pred. No. 0.46; 

Matches 96; Conservative 0; Mismatches 102; Indels 0; Gaps 0; 

Qy     32 gatggctccggcggcaaggccaaggggcccctgctgatccctttcgggatggggcggccc 91
          |||||  |   || |||||      |      | ||||    || || |||||   ||
Db   1256 GATGGTGCTATCGACAAGGTGTTAAGTGAGAAGGTGATTATCTTTGGCATGGGCAAGCGG 1197

Qy     92 aattgccccggggaaacgctcgcgctgcggaccgtcgggctggtgctcgcaacgctgctc 151
          || ||   ||| || ||  | || |   ||   |||   ||  | || || |  |||||
Db   1196 AAGTGTATCGGTGAGACCATTGCCCGCTGGGAGGTCTTTCTCTTCCTGGCTATCCTGCTG 1137

Qy    152 aattgcttcgactgggacacggttgatggagctcaggtttgacatgaagctancggcggg 211
           |  |  | || |    |  |      || |   ||||           | | |   |||
Db   1136 CAACGGGTGGAATTCAGCGTGCCACTGGGCGTGAAGGTGGACATGACCCCCATCTATGGG 1077

Qy    212 ctgaccatgccccgggcc 229
          || ||||||   |  |||
Db   1076 CTAACCATGAAGCATGCC 1059



RESULT 15 
US-09-025-819-28 

Sequence 28, Application US/09025819 
Patent No. 6225097 
GENERAL INFORMATION: 

APPLICANT: Obata, Shusei 
APPLICANT: Nishino, Tokuzo 
APPLICANT: Koyama, Tanetoshi 
APPLICANT: Sato, Yoshihiro 

TITLE OF INVENTION: DECAPRENYL DIPHOSPHATE SYNTHETASE GENE 
NUMBER OF SEQUENCES: 31 
CORRESPONDENCE ADDRESS:

ADDRESSEE: KENYON & KENYON 
STREET: 1500 K Street, N.W. 
CITY: Washington 
STATE : DC 
COUNTRY: USA 
ZIP: 20005 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS
SOFTWARE: PatentIn Release #1.0, Version #1.30
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/09/025,819
FILING DATE: 19-FEB-1998 
CLASSIFICATION: 435 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: JP 251675 
FILING DATE: 17-SEP-1997 
ATTORNEY/AGENT INFORMATION: 
NAME: Khalilian, Houri 
REGISTRATION NUMBER: 39,546 
REFERENCE/DOCKET NUMBER: 10235/2 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 202-220-4200
TELEFAX: 202-220-4201
INFORMATION FOR SEQ ID NO: 28: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 1219 base pairs 
TYPE: nucleic acid 
STRANDEDNESS: double 
TOPOLOGY: linear 
MOLECULE TYPE: DNA (genomic) 
FEATURE: 



NAME/KEY: CDS 
LOCATION: 151..1149 
US-09-025-819-28 



Query Match 8.0%; Score 34.6; DB 4; Length 1219; 

Best Local Similarity 53.3%; Pred. No. 0.4; 

Matches 73; Conservative 0; Mismatches 64; Indels 0; Gaps 0; 

Qy          3 ccagcagctcggacttacccggccggttcgatggctccggcggcaaggccaaggggcccc 62
              || |||   ||||  |  |||  |   | |  |   |||||||||||     | ||||
Db        266 CCCGCATTCCGGAAGTGACCGCGCATCTGGTCGAGGCCGGCGGCAAGCGGCTGCGGCCGA 325

Qy         63 tgctgatccctttcgggatggggcggcccaattgccccggggaaacgctcgcgctgcgga 122
              ||||| | |     | |  | ||| |  |   |  |  ||| | |  |  | ||||| |
Db        326 TGCTGGTGCTGGCGGCGGCGCGGCTGTGCGGCTATCAGGGGAACAGCCATGTGCTGCTGG 385

Qy        123 ccgtcgggctggtgctc 139
              |||  | | | | | ||
Db        386 CCGCGGCGGTCGAGTTC 402



Search completed: February 7, 2002, 11:12:24 
Job time: 7310 sec 

GenCore version 4.5 
Copyright (c) 1993 - 2000 Compugen Ltd. 



OM nucleic - nucleic search, using sw model 

Run on: February 7, 2002, 08:20:51 ; Search time 4942.22 Seconds 

(without alignments) 
939.290 Million cell updates/sec 

Title: US-09-394-745-6514 

Perfect score: 432 

Sequence: 1 gtccagcagctcggacttac attttctttttttttcttgg 432 

Scoring table: IDENTITY_NUC 

Gapop 10.0 , Gapext 1.0 

Searched: 11351937 seqs, 5372889281 residues 

Total number of hits satisfying chosen parameters: 22703874 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 

Database : EST:* 
  1: em_estfun:* 
  2: em_esthum:* 
  3: em_estin:* 
  4: em_estom:* 
  5: em_estpl:* 
  6: em_estba:* 
  7: em_estro:* 
  8: em_estov:* 
  9: em_htc:* 
 10: gb_est1:* 
 11: gb_est2:* 
 12: gb_htc:* 
 13: gb_gss:* 
 14: em_gss_fun:* 
 15: em_gss_hum:* 
 16: em_gss_inv:* 
 17: em_gss_pln:* 
 18: em_gss_pro:* 
 19: em_gss_rod:* 
 20: em_gss_vrt:* 
 21: em_gss_other:* 



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 

SUMMARIES 

                  %
Result          Query
   No.   Score  Match  Length  DB  ID          Description

     1     179   41.4     430  10  AW922538    AW922538 DG1_20_D1
     2     179   41.4     501  10  BE363286    BE363286 WS1_61_D0
     3     179   41.4     594  10  BE355191    BE355191 DG1_10_D0
     4     179   41.4     634  10  BE360028    BE360028 DG1_60_C0
     5     179   41.4     654  10  BE362029    BE362029 DG1_83_E0
     6     179   41.4     693  10  AW676742    AW676742 DG1_14_A0

7 


179 


41 
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695 


10 


BE357860 
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10 


AW922289 
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c 


15 


142.4 


33 


0 
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10 
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32 


7 


494 


10 


BE445503 


BE445503 


WHE1135 C 


c 


17 


140.2 


32 


5 


679 


10 


BE418633 


BE418633 


SCL072. F0 


c 


18 


137 .8 


31 


9 


644 


13 


AQ288789 


AQ288789 


nbxb0033H 




19 
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30 


3 
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BG464757 


BG464757 


EMI 33 GO 


c 
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1 
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AI920363 
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130.2 


30 


1 
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7 
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5 
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AW679544 
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WS1 29 DO 
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3 
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11 
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TaLrl 14 1A 
    39    52.2   12.1     266  10  AA597575    AA597575 29483 Lam
    40    52.2   12.1     553  10  BE359396    BE359396 DG1_40_F0
c   41      52   12.0     592  10  AW775060    AW775060 EST334211
    42    51.6   11.9     413  10  AA754418    AA754418 97MJ0362
    43    51.6   11.9     587  10  BE364385    BE364385 PI1_13_B1
    44      51   11.8     555  10  AW927862    AW927862 945013H06
    45      51   11.8     743  11  BI305417    BI305417 NLP 1 A13

ALIGNMENTS 



RESULT 1 

AW922538 

LOCUS       AW922538     430 bp    mRNA            EST       19-JUL-2000
DEFINITION  DG1_20_D10.g1_A002 Dark Grown 1 (DG1) Sorghum bicolor cDNA, mRNA
            sequence.
ACCESSION   AW922538
VERSION     AW922538.1  GI:8088363
KEYWORDS    EST.
SOURCE      sorghum.
  ORGANISM  Sorghum bicolor
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
            clade; Panicoideae; Andropogoneae; Sorghum.
REFERENCE   1  (bases 1 to 430)
  AUTHORS   Cordonnier-Pratt,M.-M., Gingle,A., Marsala,C., Sudman,M. and
            Pratt,L.H.
  TITLE     An EST database from Sorghum: dark-grown seedlings
  JOURNAL   Unpublished (2000)
COMMENT     Contact: Cordonnier-Pratt MM
            Department of Botany
            The University of Georgia
            Plant Sciences Building, Rm. 2502, Athens, GA 30602-7271, USA
            Tel: 706 542 1860
            Fax: 706 542 1805
            Email: mmpratt@uga.edu
            Sequences have been trimmed to exclude PolyA, vector and regions
            below Phred quality 16. The threshold for highest quality sequence
            is 20.
            Seq primer: PolyTMix
            High quality sequence start: 103
            High quality sequence stop: 430
            POLYA=No.
FEATURES             Location/Qualifiers
     source          1..430
                     /organism="Sorghum bicolor"
                     /db_xref="taxon:4558"
                     /clone_lib="Dark Grown 1 (DG1)"
                     /note="Organ: 5-day-old dark-grown seedlings; Vector:
                     Lambda Zap; Site_1: XhoI; Site_2: EcoRI; The library was
                     made from poly-A RNA in the cloning vector lambda ZAP II.
                     Clones to be sequenced were prepared by mass excision."
BASE COUNT       76 a    106 c    155 g     93 t
ORIGIN



Query Match 41.4%; Score 179; DB 10; Length 430; 

Best Local Similarity 78.8%; Pred. No. 2.4e-33; 

Matches 212; Conservative 0; Mismatches 57; Indels 0; Gaps 0; 

Qy         41 ggcggcaaggccaaggggcccctgctgatccctttcgggatggggcggcccaattgcccc 100
              | |||||||||| ||||||  |||||||| || |||||||||||||||| ||| ||||||
Db         27 GACGGCAAGGCCGAGGGGCGGCTGCTGATGCCGTTCGGGATGGGGCGGCGCAAGTGCCCC 86

Qy        101 ggggaaacgctcgcgctgcggaccgtcgggctggtgctcgcaacgctgctcaattgcttc 160
              ||||| ||||||||||||||||||||||||||||||||||  |||||| || | ||| ||
Db         87 GGGGAGACGCTCGCGCTGCGGACCGTCGGGCTGGTGCTCGGCACGCTGATCCAGTGCATC 146

Qy        161 gactgggacacggttgatggagctcaggtttgacatgaagctancggcgggctgaccatg 220
              ||||||||||  || |||||     || ||         ||    |||||||||||||||
Db        147 GACTGGGACAGAGTCGATGGCCTGGAGATTGACATGACCGCGGGTGGCGGGCTGACCATG 206

Qy        221 ccccgggccgtcccgttggaggccatgtgcangccgcgtacagctatgcgtggtgttctt 280
              ||| ||||||||||||||||||||| ||||| ||| ||| |||||||||| | ||||||
Db        207 CCCAGGGCCGTCCCGTTGGAGGCCACGTGCAAGCCTCGTGCAGCTATGCGCGATGTTCTG 266

Qy        281 aagaggctctgaaaacctcatggatcgaa 309
              | |  |||||||    || ||| |  | |
Db        267 ATGGAGCTCTGAGCCTCTGATGAAGAGTA 295
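
The percentages printed with each result can be cross-checked against the raw counts. The sketch below is illustrative only: the report does not state the formulas, so it assumes that Query Match is the score as a percentage of the perfect score (432 for this query) and that Best Local Similarity is Matches divided by the number of aligned columns (Matches + Conservative + Mismatches + Indels). Both assumptions reproduce the figures printed for RESULT 1. The names parse_stats and recompute are hypothetical helpers, not part of GenCore.

```python
import re

# Statistics lines copied from RESULT 1 above.
STATS = """\
Query Match 41.4%; Score 179; DB 10; Length 430;
Best Local Similarity 78.8%; Pred. No. 2.4e-33;
Matches 212; Conservative 0; Mismatches 57; Indels 0; Gaps 0;
"""

# "name value;" pairs; a value is a plain or scientific-notation number,
# optionally followed by a percent sign.
FIELD = re.compile(r"([A-Z][A-Za-z. ]*?)\s+(\d[\d.]*(?:[eE][-+]?\d+)?)%?;")


def parse_stats(text):
    """Return {field name: float} for one result's statistics lines."""
    return {name.strip(): float(value) for name, value in FIELD.findall(text)}


def recompute(text, perfect_score):
    """Recompute both percentages from the raw counts (assumed formulas)."""
    s = parse_stats(text)
    # Assumption: Query Match = 100 * Score / Perfect score.
    query_match = 100.0 * s["Score"] / perfect_score
    # Assumption: similarity is taken over every aligned column.
    columns = s["Matches"] + s["Conservative"] + s["Mismatches"] + s["Indels"]
    similarity = 100.0 * s["Matches"] / columns
    return round(query_match, 1), round(similarity, 1)


print(recompute(STATS, perfect_score=432))  # (41.4, 78.8)
```

The same check can be run on any other result by pasting its three statistics lines into STATS.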



RESULT 2 

BE363286 

LOCUS       BE363286     501 bp    mRNA            EST       20-JUL-2000
DEFINITION  WS1_61_D09.g1_A002 Water-stressed 1 (WS1) Sorghum bicolor cDNA,
            mRNA sequence.
ACCESSION   BE363286
VERSION     BE363286.1  GI:9304843
KEYWORDS    EST.
SOURCE      sorghum.
  ORGANISM  Sorghum bicolor
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
            clade; Panicoideae; Andropogoneae; Sorghum.
REFERENCE   1  (bases 1 to 501)
  AUTHORS   Cordonnier-Pratt,M.-M., Gingle,A., Marsala,C., Sudman,M. and
            Pratt,L.H.
  TITLE     An EST database from Sorghum: water-stressed plants
  JOURNAL   Unpublished (2000)
COMMENT     Contact: Cordonnier-Pratt MM
            Department of Botany
            The University of Georgia
            Plant Sciences Building, Rm. 2502, Athens, GA 30602-7271, USA
            Tel: 706 542 1860
            Fax: 706 542 1805
            Email: mmpratt@uga.edu
            Sequences have been trimmed to exclude PolyA, vector and regions
            below Phred quality 16. The threshold for highest quality sequence
            is 20.
            Seq primer: PolyTMix
            High quality sequence start: 32
            High quality sequence stop: 493
            POLYA=Yes.
FEATURES             Location/Qualifiers
     source          1..501
                     /organism="Sorghum bicolor"
                     /db_xref="taxon:4558"
                     /clone_lib="Water-stressed 1 (WS1)"
                     /note="Organ: Mix of 5-week old plants on days 7 & 8 after
                     water was withheld; Vector: Lambda Zap; Site_1: XhoI;
                     Site_2: EcoRI; The library was made from poly-A RNA in the
                     cloning vector lambda ZAP II. Clones to be sequenced were
                     prepared by mass excision."
BASE COUNT       82 a    134 c    190 g     95 t
ORIGIN



Query Match 41.4%; Score 179; DB 10; Length 501; 

Best Local Similarity 78.8%; Pred. No. 2.4e-33; 

Matches 212; Conservative 0; Mismatches 57; Indels 0; Gaps 0; 

Qy         41 ggcggcaaggccaaggggcccctgctgatccctttcgggatggggcggcccaattgcccc 100
              | |||||||||| ||||||  |||||||| || |||||||||||||||| ||| ||||||
Db        136 GACGGCAAGGCCGAGGGGCGGCTGCTGATGCCGTTCGGGATGGGGCGGCGCAAGTGCCCC 195

Qy        101 ggggaaacgctcgcgctgcggaccgtcgggctggtgctcgcaacgctgctcaattgcttc 160
              ||||| ||||||||||||||||||||||||||||||||||  |||||| || | ||| ||
Db        196 GGGGAGACGCTCGCGCTGCGGACCGTCGGGCTGGTGCTCGGCACGCTGATCCAGTGCATC 255

Qy        161 gactgggacacggttgatggagctcaggtttgacatgaagctancggcgggctgaccatg 220
              ||||||||||  || |||||     || ||         ||    |||||||||||||||
Db        256 GACTGGGACAGAGTCGATGGCCTGGAGATTGACATGACCGCGGGTGGCGGGCTGACCATG 315

Qy        221 ccccgggccgtcccgttggaggccatgtgcangccgcgtacagctatgcgtggtgttctt 280
              ||| ||||||||||||||||||||| ||||| ||| ||| |||||||||| | ||||||
Db        316 CCCAGGGCCGTCCCGTTGGAGGCCACGTGCAAGCCTCGTGCAGCTATGCGCGATGTTCTG 375

Qy        281 aagaggctctgaaaacctcatggatcgaa 309
              | |  |||||||    || ||| |  | |
Db        376 ATGGAGCTCTGAGCCTCTGATGAAGAGTA 404



RESULT 3 
BE355191 

LOCUS       BE355191     594 bp    mRNA            EST       20-JUL-2000
DEFINITION  DG1_10_D02.g1_A002 Dark Grown 1 (DG1) Sorghum bicolor cDNA, mRNA
            sequence.
ACCESSION   BE355191
VERSION     BE355191.1  GI:9296181
KEYWORDS    EST.
SOURCE      sorghum.
  ORGANISM  Sorghum bicolor
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
            clade; Panicoideae; Andropogoneae; Sorghum.
REFERENCE   1  (bases 1 to 594)
  AUTHORS   Cordonnier-Pratt,M.-M., Gingle,A., Marsala,C., Sudman,M. and
            Pratt,L.H.
  TITLE     An EST database from Sorghum: dark-grown seedlings
  JOURNAL   Unpublished (2000)
COMMENT     Contact: Cordonnier-Pratt MM
            Department of Botany
            The University of Georgia
            Plant Sciences Building, Rm. 2502, Athens, GA 30602-7271, USA
            Tel: 706 542 1860
            Fax: 706 542 1805
            Email: mmpratt@uga.edu
            Sequences have been trimmed to exclude PolyA, vector and regions
            below Phred quality 16. The threshold for highest quality sequence
            is 20.
            Seq primer: PolyTMix
            High quality sequence start: 27
            High quality sequence stop: 543
            POLYA=No.
FEATURES             Location/Qualifiers
     source          1..594
                     /organism="Sorghum bicolor"
                     /db_xref="taxon:4558"
                     /clone_lib="Dark Grown 1 (DG1)"
                     /note="Organ: 5-day-old dark-grown seedlings; Vector:
                     Lambda Zap; Site_1: XhoI; Site_2: EcoRI; The library was
                     made from poly-A RNA in the cloning vector lambda ZAP II.
                     Clones to be sequenced were prepared by mass excision."
BASE COUNT      100 a    172 c    213 g    109 t
ORIGIN



Query Match 41.4%; Score 179; DB 10; Length 594; 
Best Local Similarity 78.8%; Pred. No. 2.4e-33; 
Matches 212; Conservative 0; Mismatches 57; Indels 0; Gaps 0; 

Qy         41 ggcggcaaggccaaggggcccctgctgatccctttcgggatggggcggcccaattgcccc 100
              | |||||||||| ||||||  |||||||| || |||||||||||||||| ||| ||||||
Db        229 GACGGCAAGGCCGAGGGGCGGCTGCTGATGCCGTTCGGGATGGGGCGGCGCAAGTGCCCC 288

Qy        101 ggggaaacgctcgcgctgcggaccgtcgggctggtgctcgcaacgctgctcaattgcttc 160
              ||||| ||||||||||||||||||||||||||||||||||  |||||| || | ||| ||
Db        289 GGGGAGACGCTCGCGCTGCGGACCGTCGGGCTGGTGCTCGGCACGCTGATCCAGTGCATC 348

Qy        161 gactgggacacggttgatggagctcaggtttgacatgaagctancggcgggctgaccatg 220
              ||||||||||  || |||||     || ||         ||    |||||||||||||||
Db        349 GACTGGGACAGAGTCGATGGCCTGGAGATTGACATGACCGCGGGTGGCGGGCTGACCATG 408

Qy        221 ccccgggccgtcccgttggaggccatgtgcangccgcgtacagctatgcgtggtgttctt 280
              ||| ||||||||||||||||||||| ||||| ||| ||| |||||||||| | ||||||
Db        409 CCCAGGGCCGTCCCGTTGGAGGCCACGTGCAAGCCTCGTGCAGCTATGCGCGATGTTCTG 468

Qy        281 aagaggctctgaaaacctcatggatcgaa 309
              | |  |||||||    || ||| |  | |
Db        469 ATGGAGCTCTGAGCCTCTGATGAAGAGTA 497



RESULT 4 

BE360028 

LOCUS       BE360028     634 bp    mRNA            EST       20-JUL-2000
DEFINITION  DG1_60_C08.g2_A002 Dark Grown 1 (DG1) Sorghum bicolor cDNA, mRNA
            sequence.
ACCESSION   BE360028
VERSION     BE360028.1  GI:9301585
KEYWORDS    EST.
SOURCE      sorghum.
  ORGANISM  Sorghum bicolor
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
            clade; Panicoideae; Andropogoneae; Sorghum.
REFERENCE   1  (bases 1 to 634)
  AUTHORS   Cordonnier-Pratt,M.-M., Gingle,A., Marsala,C., Sudman,M. and
            Pratt,L.H.
  TITLE     An EST database from Sorghum: dark-grown seedlings
  JOURNAL   Unpublished (2000)
COMMENT     Contact: Cordonnier-Pratt MM
            Department of Botany
            The University of Georgia
            Plant Sciences Building, Rm. 2502, Athens, GA 30602-7271, USA
            Tel: 706 542 1860
            Fax: 706 542 1805
            Email: mmpratt@uga.edu
            Sequences have been trimmed to exclude PolyA, vector and regions
            below Phred quality 16. The threshold for highest quality sequence
            is 20.
            Seq primer: PolyTMix
            High quality sequence start: 38
            High quality sequence stop: 629
            POLYA=No.
FEATURES             Location/Qualifiers
     source          1..634
                     /organism="Sorghum bicolor"
                     /db_xref="taxon:4558"
                     /clone_lib="Dark Grown 1 (DG1)"
                     /note="Organ: 5-day-old dark-grown seedlings; Vector:
                     Lambda Zap; Site_1: XhoI; Site_2: EcoRI; The library was
                     made from poly-A RNA in the cloning vector lambda ZAP II.
                     Clones to be sequenced were prepared by mass excision."
BASE COUNT      116 a    153 c    219 g    146 t
ORIGIN



Query Match 41.4%; Score 179; DB 10; Length 634; 
Best Local Similarity 78.8%; Pred. No. 2.4e-33; 
Matches 212; Conservative 0; Mismatches 57; Indels 0; Gaps 0; 

Qy         41 ggcggcaaggccaaggggcccctgctgatccctttcgggatggggcggcccaattgcccc 100
              | |||||||||| ||||||  |||||||| || |||||||||||||||| ||| ||||||
Db        136 GACGGCAAGGCCGAGGGGCGGCTGCTGATGCCGTTCGGGATGGGGCGGCGCAAGTGCCCC 195

Qy        101 ggggaaacgctcgcgctgcggaccgtcgggctggtgctcgcaacgctgctcaattgcttc 160
              ||||| ||||||||||||||||||||||||||||||||||  |||||| || | ||| ||
Db        196 GGGGAGACGCTCGCGCTGCGGACCGTCGGGCTGGTGCTCGGCACGCTGATCCAGTGCATC 255

Qy        161 gactgggacacggttgatggagctcaggtttgacatgaagctancggcgggctgaccatg 220
              ||||||||||  || |||||     || ||         ||    |||||||||||||||
Db        256 GACTGGGACAGAGTCGATGGCCTGGAGATTGACATGACCGCGGGTGGCGGGCTGACCATG 315

Qy        221 ccccgggccgtcccgttggaggccatgtgcangccgcgtacagctatgcgtggtgttctt 280
              ||| ||||||||||||||||||||| ||||| ||| ||| |||||||||| | ||||||
Db        316 CCCAGGGCCGTCCCGTTGGAGGCCACGTGCAAGCCTCGTGCAGCTATGCGCGATGTTCTG 375

Qy        281 aagaggctctgaaaacctcatggatcgaa 309
              | |  |||||||    || ||| |  | |
Db        376 ATGGAGCTCTGAGCCTCTGATGAAGAGTA 404



RESULT 5 

BE362029 

LOCUS       BE362029     654 bp    mRNA            EST       20-JUL-2000
DEFINITION  DG1_83_E01.g1_A002 Dark Grown 1 (DG1) Sorghum bicolor cDNA, mRNA
            sequence.
ACCESSION   BE362029
VERSION     BE362029.1  GI:9303586
KEYWORDS    EST.
SOURCE      sorghum.
  ORGANISM  Sorghum bicolor
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
            clade; Panicoideae; Andropogoneae; Sorghum.
REFERENCE   1  (bases 1 to 654)
  AUTHORS   Cordonnier-Pratt,M.-M., Gingle,A., Marsala,C., Sudman,M. and
            Pratt,L.H.
  TITLE     An EST database from Sorghum: dark-grown seedlings
  JOURNAL   Unpublished (2000)
COMMENT     Contact: Cordonnier-Pratt MM
            Department of Botany
            The University of Georgia
            Plant Sciences Building, Rm. 2502, Athens, GA 30602-7271, USA
            Tel: 706 542 1860
            Fax: 706 542 1805
            Email: mmpratt@uga.edu
            Sequences have been trimmed to exclude PolyA, vector and regions
            below Phred quality 16. The threshold for highest quality sequence
            is 20.
            Seq primer: PolyTMix
            High quality sequence start: 62
            High quality sequence stop: 654
            POLYA=No.
FEATURES             Location/Qualifiers
     source          1..654
                     /organism="Sorghum bicolor"
                     /db_xref="taxon:4558"
                     /clone_lib="Dark Grown 1 (DG1)"
                     /note="Organ: 5-day-old dark-grown seedlings; Vector:
                     Lambda Zap; Site_1: XhoI; Site_2: EcoRI; The library was
                     made from poly-A RNA in the cloning vector lambda ZAP II.
                     Clones to be sequenced were prepared by mass excision."
BASE COUNT      115 a    173 c    226 g    140 t
ORIGIN

Query Match 41.4%; Score 179; DB 10; Length 654; 

Best Local Similarity 78.8%; Pred. No. 2.4e-33; 

Matches 212; Conservative 0; Mismatches 57; Indels 0; Gaps 0; 

Qy         41 ggcggcaaggccaaggggcccctgctgatccctttcgggatggggcggcccaattgcccc 100
              | |||||||||| ||||||  |||||||| || |||||||||||||||| ||| ||||||
Db        196 GACGGCAAGGCCGAGGGGCGGCTGCTGATGCCGTTCGGGATGGGGCGGCGCAAGTGCCCC 255

Qy        101 ggggaaacgctcgcgctgcggaccgtcgggctggtgctcgcaacgctgctcaattgcttc 160
              ||||| ||||||||||||||||||||||||||||||||||  |||||| || | ||| ||
Db        256 GGGGAGACGCTCGCGCTGCGGACCGTCGGGCTGGTGCTCGGCACGCTGATCCAGTGCATC 315

Qy        161 gactgggacacggttgatggagctcaggtttgacatgaagctancggcgggctgaccatg 220
              ||||||||||  || |||||     || ||         ||    |||||||||||||||
Db        316 GACTGGGACAGAGTCGATGGCCTGGAGATTGACATGACCGCGGGTGGCGGGCTGACCATG 375

Qy        221 ccccgggccgtcccgttggaggccatgtgcangccgcgtacagctatgcgtggtgttctt 280
              ||| ||||||||||||||||||||| ||||| ||| ||| |||||||||| | ||||||
Db        376 CCCAGGGCCGTCCCGTTGGAGGCCACGTGCAAGCCTCGTGCAGCTATGCGCGATGTTCTG 435

Qy        281 aagaggctctgaaaacctcatggatcgaa 309
              | |  |||||||    || ||| |  | |
Db        436 ATGGAGCTCTGAGCCTCTGATGAAGAGTA 464



RESULT 6 

AW676742 

LOCUS       AW676742     693 bp    mRNA            EST       19-JUL-2000
DEFINITION  DG1_14_A08.g1_A002 Dark Grown 1 (DG1) Sorghum bicolor cDNA, mRNA
            sequence.
ACCESSION   AW676742
VERSION     AW676742.1  GI:7550409
KEYWORDS    EST.
SOURCE      sorghum.
  ORGANISM  Sorghum bicolor
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
            clade; Panicoideae; Andropogoneae; Sorghum.
REFERENCE   1  (bases 1 to 693)
  AUTHORS   Cordonnier-Pratt,M.-M., Gingle,A., Marsala,C., Sudman,M. and
            Pratt,L.H.
  TITLE     An EST database from Sorghum: dark-grown seedlings
  JOURNAL   Unpublished (2000)
COMMENT     Contact: Cordonnier-Pratt MM
            Department of Botany
            The University of Georgia
            Plant Sciences Building, Rm. 2502, Athens, GA 30602-7271, USA
            Tel: 706 542 1860
            Fax: 706 542 1805
            Email: mmpratt@uga.edu
            Sequences have been trimmed to exclude PolyA, vector and regions
            below Phred quality 16. The threshold for highest quality sequence
            is 20.
            Seq primer: T7
            High quality sequence start: 76
            High quality sequence stop: 693
            POLYA=Yes.
FEATURES             Location/Qualifiers
     source          1..693
                     /organism="Sorghum bicolor"
                     /db_xref="taxon:4558"
                     /clone_lib="Dark Grown 1 (DG1)"
                     /note="Organ: 5-day-old dark-grown seedlings; Vector:
                     Lambda Zap; Site_1: XhoI; Site_2: EcoRI; The library was
                     made from poly-A RNA in the cloning vector lambda ZAP II.
                     Clones to be sequenced were prepared by mass excision."
BASE COUNT      124 a    180 c    231 g    158 t
ORIGIN



Query Match 41.4%; Score 179; DB 10; Length 693; 

Best Local Similarity 78.8%; Pred. No. 2.4e-33; 

Matches 212; Conservative 0; Mismatches 57; Indels 0; Gaps 0; 

Qy         41 ggcggcaaggccaaggggcccctgctgatccctttcgggatggggcggcccaattgcccc 100
              | |||||||||| ||||||  |||||||| || |||||||||||||||| ||| ||||||
Db        189 GACGGCAAGGCCGAGGGGCGGCTGCTGATGCCGTTCGGGATGGGGCGGCGCAAGTGCCCC 248

Qy        101 ggggaaacgctcgcgctgcggaccgtcgggctggtgctcgcaacgctgctcaattgcttc 160
              ||||| ||||||||||||||||||||||||||||||||||  |||||| || | ||| ||
Db        249 GGGGAGACGCTCGCGCTGCGGACCGTCGGGCTGGTGCTCGGCACGCTGATCCAGTGCATC 308

Qy        161 gactgggacacggttgatggagctcaggtttgacatgaagctancggcgggctgaccatg 220
              ||||||||||  || |||||     || ||         ||    |||||||||||||||
Db        309 GACTGGGACAGAGTCGATGGCCTGGAGATTGACATGACCGCGGGTGGCGGGCTGACCATG 368

Qy        221 ccccgggccgtcccgttggaggccatgtgcangccgcgtacagctatgcgtggtgttctt 280
              ||| ||||||||||||||||||||| ||||| ||| ||| |||||||||| | ||||||
Db        369 CCCAGGGCCGTCCCGTTGGAGGCCACGTGCAAGCCTCGTGCAGCTATGCGCGATGTTCTG 428

Qy        281 aagaggctctgaaaacctcatggatcgaa 309
              | |  |||||||    || ||| |  | |
Db        429 ATGGAGCTCTGAGCCTCTGATGAAGAGTA 457



RESULT 7 

BE357860 

LOCUS       BE357860     695 bp    mRNA            EST       20-JUL-2000
DEFINITION  DG1_22_C12.g1_A002 Dark Grown 1 (DG1) Sorghum bicolor cDNA, mRNA
            sequence.
ACCESSION   BE357860
VERSION     BE357860.1  GI:9299417
KEYWORDS    EST.
SOURCE      sorghum.
  ORGANISM  Sorghum bicolor
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
            clade; Panicoideae; Andropogoneae; Sorghum.
REFERENCE   1  (bases 1 to 695)
  AUTHORS   Cordonnier-Pratt,M.-M., Gingle,A., Marsala,C., Sudman,M. and
            Pratt,L.H.
  TITLE     An EST database from Sorghum: dark-grown seedlings
  JOURNAL   Unpublished (2000)
COMMENT     Contact: Cordonnier-Pratt MM
            Department of Botany
            The University of Georgia
            Plant Sciences Building, Rm. 2502, Athens, GA 30602-7271, USA
            Tel: 706 542 1860
            Fax: 706 542 1805
            Email: mmpratt@uga.edu
            Sequences have been trimmed to exclude PolyA, vector and regions
            below Phred quality 16. The threshold for highest quality sequence
            is 20.
            Seq primer: PolyTMix
            High quality sequence start: 54
            High quality sequence stop: 693
            POLYA=No.
FEATURES             Location/Qualifiers
     source          1..695
                     /organism="Sorghum bicolor"
                     /db_xref="taxon:4558"
                     /clone_lib="Dark Grown 1 (DG1)"
                     /note="Organ: 5-day-old dark-grown seedlings; Vector:
                     Lambda Zap; Site_1: XhoI; Site_2: EcoRI; The library was
                     made from poly-A RNA in the cloning vector lambda ZAP II.
                     Clones to be sequenced were prepared by mass excision."
BASE COUNT      121 a    197 c    247 g    130 t
ORIGIN



Query Match 41.4%; Score 179; DB 10; Length 695; 

Best Local Similarity 78.8%; Pred. No. 2.4e-33; 

Matches 212; Conservative 0; Mismatches 57; Indels 0; Gaps 0; 

Qy         41 ggcggcaaggccaaggggcccctgctgatccctttcgggatggggcggcccaattgcccc 100
              | |||||||||| ||||||  |||||||| || |||||||||||||||| ||| ||||||
Db        286 GACGGCAAGGCCGAGGGGCGGCTGCTGATGCCGTTCGGGATGGGGCGGCGCAAGTGCCCC 345

Qy        101 ggggaaacgctcgcgctgcggaccgtcgggctggtgctcgcaacgctgctcaattgcttc 160
              ||||| ||||||||||||||||||||||||||||||||||  |||||| || | ||| ||
Db        346 GGGGAGACGCTCGCGCTGCGGACCGTCGGGCTGGTGCTCGGCACGCTGATCCAGTGCATC 405

Qy        161 gactgggacacggttgatggagctcaggtttgacatgaagctancggcgggctgaccatg 220
              ||||||||||  || |||||     || ||         ||    |||||||||||||||
Db        406 GACTGGGACAGAGTCGATGGCCTGGAGATTGACATGACCGCGGGTGGCGGGCTGACCATG 465

Qy        221 ccccgggccgtcccgttggaggccatgtgcangccgcgtacagctatgcgtggtgttctt 280
              ||| ||||||||||||||||||||| ||||| ||| ||| |||||||||| | ||||||
Db        466 CCCAGGGCCGTCCCGTTGGAGGCCACGTGCAAGCCTCGTGCAGCTATGCGCGATGTTCTG 525

Qy        281 aagaggctctgaaaacctcatggatcgaa 309
              | |  |||||||    || ||| |  | |
Db        526 ATGGAGCTCTGAGCCTCTGATGAAGAGTA 554



RESULT 8 

AW922289 

LOCUS       AW922289     535 bp    mRNA            EST       19-JUL-2000
DEFINITION  DG1_17_H09.g1_A002 Dark Grown 1 (DG1) Sorghum bicolor cDNA, mRNA
            sequence.
ACCESSION   AW922289
VERSION     AW922289.1  GI:8088114
KEYWORDS    EST.
SOURCE      sorghum.
  ORGANISM  Sorghum bicolor
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
            clade; Panicoideae; Andropogoneae; Sorghum.
REFERENCE   1  (bases 1 to 535)
  AUTHORS   Cordonnier-Pratt,M.-M., Gingle,A., Marsala,C., Sudman,M. and
            Pratt,L.H.
  TITLE     An EST database from Sorghum: dark-grown seedlings
  JOURNAL   Unpublished (2000)
COMMENT     Contact: Cordonnier-Pratt MM
            Department of Botany
            The University of Georgia
            Plant Sciences Building, Rm. 2502, Athens, GA 30602-7271, USA
            Tel: 706 542 1860
            Fax: 706 542 1805
            Email: mmpratt@uga.edu
            Sequences have been trimmed to exclude PolyA, vector and regions
            below Phred quality 16. The threshold for highest quality sequence
            is 20.
            Seq primer: PolyTMix
            High quality sequence start: 10
            High quality sequence stop: 511
            POLYA=No.
FEATURES             Location/Qualifiers
     source          1..535
                     /organism="Sorghum bicolor"
                     /db_xref="taxon:4558"
                     /clone_lib="Dark Grown 1 (DG1)"
                     /note="Organ: 5-day-old dark-grown seedlings; Vector:
                     Lambda Zap; Site_1: XhoI; Site_2: EcoRI; The library was
                     made from poly-A RNA in the cloning vector lambda ZAP II.
                     Clones to be sequenced were prepared by mass excision."
BASE COUNT      109 a    113 c    163 g    150 t
ORIGIN



Query Match 39.2%; Score 169.2; DB 10; Length 535; 

Best Local Similarity 78.5%; Pred. No. 5.6e-31; 

Matches 201; Conservative 0; Mismatches 55; Indels 0; Gaps 0; 

Qy         54 aggggcccctgctgatccctttcgggatggggcggcccaattgccccggggaaacgctcg 113
              ||||||  |||||||| || |||||||||||||||| ||| ||||||||||| |||||||
Db          1 AGGGGCGGCTGCTGATGCCGTTCGGGATGGGGCGGCGCAAGTGCCCCGGGGAGACGCTCG 60

Qy        114 cgctgcggaccgtcgggctggtgctcgcaacgctgctcaattgcttcgactgggacacgg 173
              |||||||||||||||||||||||||||  |||||| || | ||| ||||||||||||  |
Db         61 CGCTGCGGACCGTCGGGCTGGTGCTCGGCACGCTGATCCAGTGCATCGACTGGGACAGAG 120

Qy        174 ttgatggagctcaggtttgacatgaagctancggcgggctgaccatgccccgggccgtcc 233
              | |||||     || ||         ||    |||||||||||||||||| |||||||||
Db        121 TCGATGGCCTGGAGATTGACATGACCGCGGGTGGCGGGCTGACCATGCCCAGGGCCGTCC 180

Qy        234 cgttggaggccatgtgcangccgcgtacagctatgcgtggtgttcttaagaggctctgaa 293
              |||||||||||| ||||| ||| ||| |||||||||| | |||||| | |  |||||||
Db        181 CGTTGGAGGCCACGTGCAAGCCTCGTGCAGCTATGCGCGATGTTCTGATGGAGCTCTGAG 240

Qy        294 aacctcatggatcgaa 309
                 || ||| |  | |
Db        241 CCTCTGATGAAGAGTA 256



RESULT 9 
AI668207/C 
LOCUS       AI668207     584 bp    mRNA            EST       02-FEB-2000
DEFINITION  605018C02.x1 605 - Endosperm cDNA library from Schmidt lab Zea mays
            cDNA, mRNA sequence.
ACCESSION   AI668207
VERSION     AI668207.1  GI:4827515
KEYWORDS    EST.
SOURCE      Zea mays.
  ORGANISM  Zea mays
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
            clade; Panicoideae; Andropogoneae; Zea.
REFERENCE   1  (bases 1 to 584)
  AUTHORS   Walbot,V.
  TITLE     Maize ESTs from various cDNA libraries sequenced at Stanford
            University
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Walbot V
            Department of Biological Sciences
            Stanford University
            855 California Ave, Palo Alto, CA 94304, USA
            Tel: 650 723 2227
            Fax: 650 725 8221
            Email: walbot@stanford.edu
            Plate: 605018 row: C column: 02.
FEATURES             Location/Qualifiers
     source          1..584
                     /organism="Zea mays"
                     /cultivar="Ohio43"
                     /db_xref="taxon:4577"
                     /clone_lib="605 - Endosperm cDNA library from Schmidt lab"
                     /tissue_type="nucellar, embryo, and endosperm"
                     /dev_stage="10-14 days post-pollination"
                     /lab_host="DH5 (alpha)"
                     /note="Organ: Kernel; Vector: pAD-GAL4-2'; Site_1: EcoRI;
                     Site_2: XhoI; Kernel endosperm cDNA library from Schmidt
                     lab"
BASE COUNT      111 a    188 c    183 g    102 t
ORIGIN



Query Match 37.7%; Score 163; DB 10; Length 584; 
Best Local Similarity 77.5%; Pred. No. 1.8e-29; 
Matches 196; Conservative 0; Mismatches 57; Indels 0; Gaps 0; 

Qy         40 cggcggcaaggccaaggggcccctgctgatccctttcgggatggggcggcccaattgccc 99
              || |||||||||| |||| |  ||| || | || ||||||||||| |||| ||  |||||
Db        391 CGACGGCAAGGCCGAGGGCCGGCTGATGCTGCCGTTCGGGATGGGACGGCGCAGGTGCCC 332

Qy        100 cggggaaacgctcgcgctgcggaccgtcgggctggtgctcgcaacgctgctcaattgctt 159
              |||||| || |||||||||||||||| ||| || |||||||| |||||  || | |||||
Db        331 CGGGGAGACACTCGCGCTGCGGACCGCCGGCCTCGTGCTCGCCACGCTCATCCAGTGCTT 272

Qy        160 cgactgggacacggttgatggagctcaggtttgacatgaagctancggcgggctgaccat 219
              | ||||||||| | | ||||| ||| || |          |  | ||||||||| |||||
Db        271 CCACTGGGACAGGATCGATGGCGCTGAGATCGACATGACCGAGAGCGGCGGGCTCACCAT 212

Qy        220 gccccgggccgtcccgttggaggccatgtgcangccgcgtacagctatgcgtggtgttct 279
              ||||||||||||||||||||||||||  |||| ||| ||   ||| ||||||  ||||||
Db        211 GCCCCGGGCCGTCCCGTTGGAGGCCACCTGCAAGCCTCGCGAAGCCATGCGTCATGTTCT 152

Qy        280 taagaggctctga 292
              | ||  |||||||
Db        151 TCAGCAGCTCTGA 139



RESULT 10 
BG320973/C 
LOCUS       BG320973     790 bp    mRNA            EST       27-FEB-2001
DEFINITION  Zm04_02d03_A Zm04_AAFC_ECORC_cold_stressed_maize_seedlings Zea mays
            cDNA clone Zm04_02d03, mRNA sequence.
ACCESSION   BG320973
VERSION     BG320973.1  GI:13150651
KEYWORDS    EST.
SOURCE      Zea mays.
  ORGANISM  Zea mays
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
            clade; Panicoideae; Andropogoneae; Zea.
REFERENCE   1  (bases 1 to 790)
  AUTHORS   Singh,J.A., Wakui,K., Couroux,P., De Moors,A., Harris,L.J.,
            Hattori,J.I., Ouellet,T., Robert,L.S., Sprott,D. and Tinker,N.A.
  TITLE     Expressed Sequence Tags from Cold-Stressed Maize Seedlings
  JOURNAL   Unpublished (2001)
COMMENT     Contact: Singh, J.A.
            Eastern Cereal and Oilseed Research Centre
            Agriculture and Agri-food Canada
            960 Carling Avenue, Bldg. 20, Ottawa, Ontario, K1A 0C6, Canada
            Tel: (613) 759-1662
            Fax: (613) 759-1701
            Email: singhja@em.agr.ca.
FEATURES             Location/Qualifiers
     source          1..790
                     /organism="Zea mays"
                     /cultivar="C0328"
                     /db_xref="taxon:4577"
                     /clone="Zm04_02d03"
                     /clone_lib="Zm04_AAFC_ECORC_cold_stressed_maize_seedlings"
                     /tissue_type="Leaf, crown"
                     /note="Vector: Bluescript SK-/XhoI-EcoRI; Site_1: Eco RI;
                     Site_2: Xho I; Lower temperature 5o C / hour from 22 to
                     12oC; bring to 5o in 1 hour from 12oC. Leave at 5oC 2 days,
                     photoperiod 16 hours. Light intensity was 125 uE-1.
                     Library prepared by in vivo mass excision from amplified
                     library."
BASE COUNT      145 a    256 c    241 g    143 t      5 others
ORIGIN



Query Match 34.8%; Score 150.4; DB 11; Length 790; 
Best Local Similarity 76.8%; Pred. No. 1.9e-26; 
Matches 195; Conservative 0; Mismatches 58; Indels 1; Gaps 1; 

Qy         40 cggcggcaaggccaaggggcccctgctgatccctttcgggatggggcggcccaattgccc 99
              || |||||||||| |||| |  ||| || | || ||||||||||| |||| ||  |||||
Db        393 CGACGGCAAGGCCGAGGGCCGGCTGATGCTGCCGTTCGGGATGGGACGGCGCAGGTGCCC 334

Qy        100 cggggaaacg-ctcgcgctgcggaccgtcgggctggtgctcgcaacgctgctcaattgct 158
              |||||| ||| ||||| |||||||||| ||| || |||||||| |||||  || | ||||
Db        333 CGGGGAGACGCCTCGCTCTGCGGACCGCCGGCCTCGTGCTCGCCACGCTCATCCAGTGCT 274

Qy        159 tcgactgggacacggttgatggagctcaggtttgacatgaagctancggcgggctgacca 218
              || |||||||||   | ||||| ||| || |          |  | ||||||||| ||||
Db        273 TCCACTGGGACAGAATCGATGGCGCTGAGATCGACATGACCGAGAGCGGCGGGCTCACCA 214

Qy        219 tgccccgggccgtcccgttggaggccatgtgcangccgcgtacagctatgcgtggtgttc 278
              |||||||||||||||||||||||||||  |||| ||| ||   ||| ||||||  |||||
Db        213 TGCCCCGGGCCGTCCCGTTGGAGGCCACCTGCAAGCCTCGCGAAGCCATGCGTCATGTTC 154

Qy        279 ttaagaggctctga 292
              || ||  |||||||
Db        153 TTCAGCAGCTCTGA 140



RESULT 11 

BG464759 

LOCUS BG464759 357 bp mRNA EST 20-MAR-2001
DEFINITION EM1_33_G05.g1_A002 Embryo 1 (EM1) Sorghum bicolor cDNA, mRNA
sequence.
ACCESSION BG464759
VERSION BG464759.1 GI:13393586
KEYWORDS EST.
SOURCE sorghum.
ORGANISM Sorghum bicolor
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
clade; Panicoideae; Andropogoneae; Sorghum.
REFERENCE 1 (bases 1 to 357)
AUTHORS Reid,S.P., Cordonnier-Pratt,M.-M., Gingle,A. and Pratt,L.H.
TITLE An EST database from Sorghum: developing embryos
JOURNAL Unpublished (2000)
COMMENT Contact: Cordonnier-Pratt MM
Department of Botany
The University of Georgia
Plant Sciences Building, Rm. 2502, Athens, GA 30602-7271, USA
Tel: 706 542 1860
Fax: 706 542 1805
Email: mmpratt@uga.edu
Sequences have been trimmed to exclude PolyA, vector and regions
below Phred quality 16. The threshold for highest quality sequence
is 20.
Seq primer: PolyTMix
High quality sequence start: 101
High quality sequence stop: 228
POLYA=No.
FEATURES Location/Qualifiers
source 1..357
/organism="Sorghum bicolor"
/db_xref="taxon:4558"
/clone_lib="Embryo 1 (EM1)"
/note="Organ: Embryos germinated for 24 hr; Vector:
pBluescript II from Lambda Zap II; Site_1: XhoI; Site_2:
EcoRI; The library was made from poly-A RNA in the cloning
vector lambda ZAP II. Clones to be sequenced were
prepared by mass excision."
BASE COUNT 63 a 86 c 127 g 79 t 2 others
ORIGIN



Query Match 34.8%; Score 150.2; DB 11; Length 357; 

Best Local Similarity 66.7%; Pred. No. 2.2e-26; 

Matches 226; Conservative 0; Mismatches 112; Indels 1; Gaps 1; 

Qy 78 ggatggggcggcccaattgccccggggaaacgctcgcgctgcggaccgtcgggctggtgc 137 

III I I I I I 1 I I III I I I I I I I I I I I I I I I I I II I I I I II I I I I I I I I I I I I I I II I 
Db 1 GGAAGGGGCGGCGCAAGTGCCCCGGGGAGACGCTCGCGCTGCGGACCGTCGGGCTGGTGC 60 

Qy 138 tcgcaacgctgctcaattgcttcgactgggacacggttgatggagctcaggtttgacatg 197 

III I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I 

Db 61 TCGGCACGCTGATCCAGTGCATCGACTGGGACAGAGTCGANGGCCTGGAGATTGACATGA 120 

Qy 198 aagctancggcgggctgaccatgccccgggccgtcccgttggaggccatgtgcangccgc 257 

II I I I I I II I I I I II I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I III I 
Db 121 CCGCGGGTGGCGGGCTGACCATGCCCAGGGCCGTCCCGTTGGAGGCCACGTGCAAGCCTC 180 

Qy 258 gtacagctatgcgtggtgttcttaagaggctctgaaaacctcatggatcgaattgctggc 317 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I 

Db 181 GTGCAGCTATGCGCGATGTTCTGATGGAGCTCTGAGCCTCTATGAAGAGTACATCTTGGC 240

Qy 318 -atcgtctgaagggtgtatgacgtagcttccgagttccgagcatatatattcacttgcct 376

II II I I I I I I I I I I I I I I I I III II 

Db 241 AATGATCCCTAGGGTCTCACTGCGTGGTACTGAGGTTCAACCGGTACTAGTGTGTAGGTG 300 

Qy 377 tgtaacaatttaattttcgccgattgtatggaatggatt 415 

I I I I I I I I I I I I I I II I I I I I I 

Db 301 TGTAGCAGTANTGCTTTGGCTTATGGTGTGTGCTGAACT 339 



RESULT 12 
BG464902 

LOCUS BG464902 293 bp mRNA EST 20-MAR-2001
DEFINITION EM1_35_G06.g1_A002 Embryo 1 (EM1) Sorghum bicolor cDNA, mRNA
sequence.
ACCESSION BG464902
VERSION BG464902.1 GI:13393837
KEYWORDS EST.
SOURCE sorghum.
ORGANISM Sorghum bicolor
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; PACC
clade; Panicoideae; Andropogoneae; Sorghum.
REFERENCE 1 (bases 1 to 293)
AUTHORS Reid,S.P., Cordonnier-Pratt,M.-M., Gingle,A. and Pratt,L.H.
TITLE An EST database from Sorghum: developing embryos
JOURNAL Unpublished (2000)
COMMENT Contact: Cordonnier-Pratt MM
Department of Botany
The University of Georgia
Plant Sciences Building, Rm. 2502, Athens, GA 30602-7271, USA
Tel: 706 542 1860
Fax: 706 542 1805
Email: mmpratt@uga.edu
Sequences have been trimmed to exclude PolyA, vector and regions
below Phred quality 16. The threshold for highest quality sequence
is 20.
Seq primer: PolyTMix
High quality sequence start: 7
High quality sequence stop: 219
POLYA=No.
FEATURES Location/Qualifiers
source 1..293
/organism="Sorghum bicolor"
/db_xref="taxon:4558"
/clone_lib="Embryo 1 (EM1)"
/note="Organ: Embryos germinated for 24 hr; Vector:
pBluescript II from Lambda Zap II; Site_1: XhoI; Site_2:
EcoRI; The library was made from poly-A RNA in the cloning
vector lambda ZAP II. Clones to be sequenced were
prepared by mass excision."
BASE COUNT 51 a 79 c 104 g 58 t 1 others
ORIGIN



Query Match 34.4%; Score 148.6; DB 11; Length 293; 

Best Local Similarity 77.4%; Pred. No. 5.5e-26; 

Matches 178; Conservative 0; Mismatches 52; Indels 0; Gaps 0; 

Qy 80 atggggcggcccaattgccccggggaaacgctcgcgctgcggaccgtcgggctggtgctc 139 

I I I I I I I I I I III I I I I I ! I ! I I I I I II I II I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 ATGGGGCGGCGCAAGTGCCCCGGGGAGACGCTCGCGCTGCGGACCGTCGGGCTGGTGCTC 60 

Qy 140 gcaacgctgctcaattgcttcgactgggacacggttgatggagctcaggtttgacatgaa 199 

I I J II I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I 

Db 61 GGCACGCTGATCCAGTGCATCGACTGGGACAGAGTCGATGGCCTGGAGATTGACATGACC 120 

Qy 200 gctancggcgggctgaccatgccccgggccgtcccgttggaggccatgtgcangccgcgt 259 

II I I I I M I II I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I III III 

Db 121 GCGGGTGGCGGGCTGACCATGCCCAGGGCCGTCCCGTTGGAGGCCACGTGCAAGCCTCGT 180 

Qy 260 acagctatgcgtggtgttcttaagaggctctgaaaacctcatggatcgaa 309
I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I II 



Db 181 GCAGCTATGCGCGATGTTCTGATGGAGCTCTGAGCCTCTGANGAAGAGTA 230 



RESULT 13 
BE704790/C 

LOCUS BE704790 803 bp mRNA EST 12-SEP-2000 

DEFINITION Sc02_02e10_A Sc02_AAFC_ECORC_cold_stressed_winter_rye_seedlings
Secale cereale cDNA clone Sc02_02e10, mRNA sequence.
ACCESSION BE704790 

VERSION BE704790.1 GI:10093055 

KEYWORDS EST. 
SOURCE rye. 

ORGANISM Secale cereale 

Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; Pooideae
; Triticeae; Secale.
REFERENCE 1 (bases 1 to 803) 

AUTHORS Singh,J.A., Piche,C., Couroux,P., De Moors,A., Harris,L.J., Hattori
,J.I., Ouellet,T., Robert,L.S., Sprott,D. and Tinker,N.A.
TITLE Expressed Sequence Tags from Cold-Stressed Winter Rye Seedlings 

JOURNAL Unpublished (2000) 
COMMENT Contact: Singh, J. A. 

Eastern Cereal and Oilseed Research Centre 
Agriculture and Agri-food Canada 

960 Carling Avenue, Bldg. 20, Ottawa, Ontario, K1A 0C6, Canada 
Tel: (613) 759-1662 
Fax: (613) 759-1701 
Email: singhja@em.agr.ca.
FEATURES Location/Qualifiers
source 1..803
/organism="Secale cereale"
/cultivar="Puma (winter rye)"
/db_xref="taxon:4550"
/clone="Sc02_02e10"
/clone_lib="Sc02_AAFC_ECORC_cold_stressed_winter_rye_seedlings"
/tissue_type="leaf, crown"
/dev_stage="seedling three-leaf stage"
/note="Vector: Bluescript SK+/XhoI-EcoRI; Site_1: Eco RI;
Site_2: Xho I; Sampled three-leaf seedlings treated for
one week at 2oC, 12 hrs light/day. Library made with
Stratagene UNI ZAP XR Kit/ (not packaged). cDNA is directly
ligated into SK+/XhoI-EcoRI, then electroporated into
TOP10 cells (Invitrogen)."

BASE COUNT 157 a 279 c 221 g 143 t 3 others

ORIGIN 



Query Match 33.8%; Score 145.8; DB 10; Length 803; 

Best Local Similarity 72.4%; Pred. No. 2.4e-25; 

Matches 186; Conservative 1; Mismatches 70; Indels 0; Gaps 0; 

Qy 41 ggcggcaaggccaaggggcccctgctgatccctttcgggatggggcggcccaattgcccc 100 

I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 494 GACGGCAAGGCCGAGGGGCGGTTCATGATCCCGTTCGGGATGGGCCGGCGGCGGTGCCCC 435

Qy 101 ggggaaacgctcgcgctgcggaccgtcgggctggtgctcgcaacgctgctcaattgcttc 160 



Db 434 GGGGAGAMGCTGGCGCTGCGGACCATCGGCATGGTGCTGGCCACGCTGGTGCAGTGCTTT 375

Qy 161 gactgggacacggttgatggagctcaggtttgacatgaagctancggcgggctgaccatg 220 

I I I I I I I I II I I I II I I I I I I I I I I I I I I I I I I II I 

Db 374 GACTGGGAGCGCGTGGATGGCGCGGAGGTGGACATGACGGAGGGCGGCGGGCTCACCATC 315 

Qy 221 ccccgggccgtcccgttggaggccatgtgcangccgcgtacagctatgcgtggtgttctt 280 

III I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I II I I I I I I 
Db 314 CCCAAGGCCATGCCGCTTGAGGCCGTGTGCAGGCCGCGCACGGCCATGCGCGACGTGCTT 255 



Qy 281 aagaggctctgaaaacc 297 

I I I I I I I I II II 
Db 254 CAGAGCCTCTGATGGCC 238 



RESULT 14 
AL503532/C 
LOCUS AL503532 700 bp mRNA EST 04-JAN-2001
DEFINITION AL503532 Hordeum vulgare Barke roots Hordeum vulgare cDNA clone
HW02H20T 5', mRNA sequence.
ACCESSION AL503532
VERSION AL503532.1 GI:12029747
KEYWORDS EST.
SOURCE barley.
ORGANISM Hordeum vulgare
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; Pooideae
; Triticeae; Hordeum.
REFERENCE 1 (bases 1 to 700)
AUTHORS Michalek,W., Weschke,W., Pleissner,K.-P. and Graner,A.
TITLE EST sequencing and analysis in barley
JOURNAL Unpublished (2000)
COMMENT Contact: Michalek W
Institute for Plant Genetics and Crop Plant Research
Corrensstr. 3, D-06466 Gatersleben, Germany
Email: michalek@ipk-gatersleben.de, http://pgrc.ipk-gatersleben.de
Seq primer: T3 primer for 5'end.
FEATURES Location/Qualifiers
source 1..700
/organism="Hordeum vulgare"
/cultivar="Barke"
/db_xref="taxon:4513"
/clone="HW02H20T"
/clone_lib="Hordeum vulgare Barke roots"
/tissue_type="roots"
/lab_host="XLOLR"
/note="Vector: plasmid pBK-CMV; Site_1: EcoRI; Site_2:
XhoI; mRNA was made from roots of spring barley variety
'Barke', a high quality malting variety. Roots were grown
for two days on filter paper at room temperature Cloning
sites: EcoRI (5'-end of cDNA) and XhoI (3'-end of cDNA).
NOTE: Due to a cloning artefact caused by the kit, in most
cases the EcoRI site is NOT present, as well as the EcoRI
adapter. Average insert size is 1 kb Sequence trimming:
Vector sequences and sequence ends were trimmed from the
5'- and 3'-end until a 50 bp window contains less than two
ambiguities. The maximum length was set to 700 bp"
BASE COUNT 132 a 229 c 196 g 137 t 6 others

ORIGIN



Query Match 33.0%; Score 142.4; DB 10; Length 700; 

Best Local Similarity 72.2%; Pred. No. 1.6e-24; 

Matches 182; Conservative 0; Mismatches 70; Indels 0; Gaps 0; 

Qy 41 ggcggcaaggccaaggggcccctgctgatccctttcgggatggggcggcccaattgcccc 100

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 

Db 476 GACGGCAAGGCCGAGGGGCGGTTCATGATCCCGTTCGGGATGGGCCGCCGGCGGTGCCCC 417

Qy 101 ggggaaacgctcgcgctgcggaccgtcgggctggtgctcgcaacgctgctcaattgcttc 160 

I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I 
Db 416 GGGGANACGCTGGCGCTGCGGACCATCGGCATGGTGCTGGCCACGCTGGTGCAGTGCTTC 357 

Qy 161 gactgggacacggttgatggagctcaggtttgacatgaagctancggcgggctgaccatg 220 

I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I II I 

Db 356 GACTGGGACCGCGTCGACGGCAAGGAGGTGGACATGACGGAGAGCGGCGGGCTCACCATC 297

Qy 221 ccccgggccgtcccgttggaggccatgtgcangccgcgtacagctatgcgtggtgttctt 280 

III I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 296 CCCAAGGCCGTGCCGCTCGAGGCCGTNTGCAGGCCGCGCCCGGCCATGCGCGACGTGCTC 237

Qy 281 aagaggctctga 292 

I I I I I I II I I 
Db 236 CAGAGCCTCTGA 225



RESULT 15 
BE412662/C 
LOCUS BE412662 702 bp mRNA EST 24-JUL-2000
DEFINITION MCG007.D10R990625 ITEC MCG Barley Leaf/Culm Library Hordeum vulgare
cDNA clone MCG007.D10, mRNA sequence.
ACCESSION BE412662
VERSION BE412662.1 GI:9410620
KEYWORDS EST.
SOURCE barley.
ORGANISM Hordeum vulgare
Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae; Pooideae
; Triticeae; Hordeum.
REFERENCE 1 (bases 1 to 702)
AUTHORS Anderson,O.A., Appels,R., Bailey,P., Blake,T., Close,T., Cloutier
,S., Dubcovsky,J., Feuillet,C., Gale,M., Graner,A., Gustafson,P.,
Herrmann,R.G., Holton,T., Jacquemin,J.M., Jia,J., Joudrier,P.,
Langridge,P., Lazo,G.R., Lin,J.J., McGuire,P., Ogihara,Y.,
Pecchioni,N., Qualset,C., Schuch,W., Selvaraj,G., Shariflou,M.,
Sorrells,M., Warburton,M. and Wenzel,G.
TITLE International Triticeae EST Cooperative (ITEC): Production of
Expressed Sequence Tags for Species of the Triticeae
JOURNAL Unpublished (2000)
COMMENT Contact: Graner A
Institute for Plant Genetics & Crop Plant Research
Corrensstr. 3, D-06466 Gatersleben GERMANY
Tel: 49 39482 5521
Fax: 49 39482 5137
Email: a_graner@ipk-gatersleben.de
International Triticeae EST Cooperative (ITEC)
http://wheat.pw.usda.gov/genome.
FEATURES Location/Qualifiers
source 1..702
/organism="Hordeum vulgare"
/db_xref="taxon:4513"
/clone="MCG007.D10"
/clone_lib="ITEC MCG Barley Leaf/Culm Library"
/tissue_type="leaf/culm"
/dev_stage="etiolated"
BASE COUNT 132 a 229 c 198 g 137 t 6 others
ORIGIN



Query Match 33.0%; Score 142.4; DB 10; Length 702;

Best Local Similarity 72.2%; Pred. No. 1.6e-24;

Matches 182; Conservative 0; Mismatches 70; Indels 0; Gaps 0;

Qy 41 ggcggcaaggccaaggggcccctgctgatccctttcgggatggggcggcccaattgcccc 100 

I I I I I I I I I I I I I I I I I I I I I I I I I I I i I I I I I I I I I I I I I I I I I 

Db 476 GACGGCAAGGCCGAGGGGCGGTTCATGATCCCGTTCGGGATGGGCCGCCGGCGGTGCCCC 417

Qy 101 ggggaaacgctcgcgctgcggaccgtcgggctggtgctcgcaacgctgctcaattgcttc 160 

I I I I I Mill I I I I I I I I I I I I I I II I I I I I II II I I I I I I I I I I I I I I 
Db 416 GGGGANACGCTGGCGCTGCGGACCATCGGCATGGTGCTGGCCACGCTGGTGCAGTGCTTC 357 

Qy 161 gactgggacacggttgatggagctcaggtttgacatgaagctancggcgggctgaccatg 220 

I I I I II I I I I I I I I I I I I I I I I I I II I I I I I I I I I 

Db 356 GACTGGGACCGCGTCGACGGCAAGGAGGTGGACATGACGGAGAGCGGCGGGCTCACCATC 297

Qy 221 ccccgggccgtcccgttggaggccatgtgcangccgcgtacagctatgcgtggtgttctt 280 

III I I I I I I I M I I II I i I I I I I I I I I I I I I II Mill I I I I I 
Db 296 CCCAAGGCCGTGCCGCTCGAGGCCGTNTGCAGGCCGCGCCCGGCCATGCGCGACGTGCTC 237



Qy 281 aagaggctctga 292 

I I II I I I I II 
Db 236 CAGAGCCTCTGA 225



Search completed: February 7, 2002, 08:20:54 
Job time: 18131 sec 
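
The per-result statistics in this report can be cross-checked with simple arithmetic. The sketch below is not part of the GenCore output. It assumes that "Best Local Similarity" is the number of matches divided by the number of alignment columns (matches + conservative + mismatches + indels), and that "Query Match" is the hit score divided by the perfect score of 432 given in the report header. Both formulas are inferred from the printed values, not taken from GenCore documentation, and the function names are hypothetical.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percent of alignment columns that are exact matches (assumed formula)."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


def query_match(score, perfect_score):
    """Hit score as a percentage of the perfect self-match score (assumed formula)."""
    return 100.0 * score / perfect_score


if __name__ == "__main__":
    PERFECT_SCORE = 432  # "Perfect score" from the report header

    # (label, score, matches, conservative, mismatches, indels), as printed above
    results = [
        ("RESULT 11", 150.2, 226, 0, 112, 1),
        ("RESULT 12", 148.6, 178, 0, 52, 0),
        ("RESULT 14", 142.4, 182, 0, 70, 0),
    ]
    for label, score, m, c, mm, ind in results:
        qm = query_match(score, PERFECT_SCORE)
        bls = best_local_similarity(m, c, mm, ind)
        print(f"{label}: Query Match {qm:.1f}%  Best Local Similarity {bls:.1f}%")
```

Under these assumptions the recomputed percentages agree with the printed values for the results listed, which also gives a quick way to spot digits damaged in transcription.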



